EMOTIONAL DISRUPTION AND INDUSTRIAL
PRODUCTIVITY 1


STANLEY SCHACHTER 
Columbia University 


LEON FESTINGER 


Stanford University 


The changes in working procedures re- 
quired by the introduction of new models or 
methods of construction are regular and quite 
frequent events in many industries, but the 
smoothness of the required transition from 
one working procedure to another is highly 
variable. Occasionally, rebalancing a line or 
introducing new methods of work proceeds 
with no difficulty. In a matter of hours or 
even minutes a work group may be operating 
as efficiently and productively as immediately 
prior to the change. More often, however, 
productivity or quality of work may drop 
precipitously after a change and it may re- 
quire days or weeks for a work group to reach 
the planned quality and production goals. 

The factors affecting the smoothness of
this sort of transition are undoubtedly mul- 
tiple. Engineering and planning practices play 
a major role, and supervisory practices un- 
doubtedly have an impact. Finally, there is 
increasing evidence that psychological factors 
affect the receptivity or resistance of the 
workers to a change. The purpose of the re- 
search to be described was to examine the 
effects of some of these psychological factors 
in experimental situations where it is possible
to rule out or control the possibly confounding
effects of engineering, planning, and super-
visory practices.


1 These studies were sponsored by the Behavioral
Research Service of the General Electric Company.
The authors are particularly grateful to L. L. Fergu-
son, Director of the Behavioral Research Service, for
his help in initiating, planning, and carrying out these
experiments. We wish also to express our apprecia-
tion to the managers and other operating personnel
in each of the General Electric departments in which
these studies were conducted for their cooperation
and help in the execution of each of these experiments.

For assembly line operators the motor skill 
difficulties involved in most changeovers ap- 
pear to be minimal since each job consists of 
a few basic movements repeated over and 
over again. Most rebalancings, insofar as they 
affect the individual operator, involve such 
minor modifications of a basically simple pat- 
tern of work that only a few minutes or 
hours of practice should restore the operator 
to his former level of skill. Yet weeks some- 
times go by before an operator returns to an 
acceptable level of quality and output. Since 
the difficulty of relearning alone cannot be 
considered an adequate explanation, it would 
seem that emotional and motivational factors 
may be heavily involved.2

It is commonly accepted that emotional dis- 
turbance can seriously interfere with the per- 
formance of particular kinds of tasks. Such 
states as fear, rage, and hostility may se- 
riously disrupt the kind of coordination re- 
quired in the performance of an assembly 
operation. The question immediately arises, 
however, as to whether such emotional states 
are equally disruptive of the performance of 
all manual tasks of this sort or whether the 
degree of disruption varies with the nature of 
the task being performed. As a first step in 
consideration of this question, let us examine 
a dimension of motor activity which we shall
call stereotypy. By stereotypy we refer to the
extent to which behavior has an automatized
or habitual character, requiring neither con-
centration, attention, nor thought. Examples of
thoroughly stereotyped behaviors in adults
would be walking, eating, dressing, and the
like. It is perfectly possible to carry on such
activities while paying virtually no attention
to the actions involved.


2 For a full discussion of group effects on workers’
motivation to accept or resist change, see
Coch and French (1948).

It is suggested that, once mastered, the 
typical industrial assembly operation is of 
precisely this stereotyped character.3 It is an
operation which may be repeated thousands 
of times daily and over the course of time 
undoubtedly becomes a thoroughly automa- 
tized pattern of motor activity. Consider now 
the effects of introducing a change in work 
procedure on this pattern of stereotyped be- 
havior. From a completely habitual set of 
actions, the job once more requires the op- 
erator’s attention. Probably even a trivial 
modification will require the operator’s con- 
centration; and, where previous to the intro- 
duction of a new work procedure an operator 
may have performed his job with dispatch 
and virtual automaticity, the change requires
that he think about and be constantly aware
of his movements. The effect of introducing
a change, then, is to convert a stereotyped
pattern of behavior into one which requires
concentration, for a time at least, if the activ-
ity is to be successfully carried out.


3 If this characterization of the assembly process is
correct, it facilitates a resolution of what has so
long seemed a paradox: the fact that though an
assembly job is precisely the kind of work that is
so often considered tedious, demoralizing, and un-
supportable, even casual observation of such a factory
operation makes it apparent that this is simply not
the case. The operators on such a line are usually
relaxed and at ease, they converse casually and joke
freely with one another, all the while doing their
job. Interviews conducted with assembly workers in-
dicate a relatively high degree of job satisfaction.

Experimental studies of satiation provide some de-
gree of insight into the process which makes this
repetitious, automatic sort of job a bearable and
perhaps even a pleasant one. In such experiments,
subjects are required to work continuously at a
repetitive task and their rate of boredom, fatigue,
and satiation measured. Karsten (1928) demonstrated
experimentally that when their task is an important
one requiring the subjects to concentrate upon their
activity, satiation sets in much more rapidly than
when the task is a peripheral one to which the
subject pays little attention. This would suggest that
an assembly operation is bearable because it quickly
becomes a peripheral, almost automatized or stereo-
typed pattern of behavior like walking. And the
observation that operators are able to perform their
jobs while talking or daydreaming would certainly
support this characterization of the assembly opera-
tion as stereotyped behavior.

Let us now consider the effects of states 
of emotional disturbance on behaviors of varying
degrees of stereotypy. It is suggested as
our major hypothesis that emotional disturb- 
ance will be maximally disruptive of motor 
behaviors requiring thought and concentra- 
tion and minimally disruptive of thoroughly 
stereotyped behaviors. Translating this into 
terms directly applicable to the industrial 
problem under consideration, it is suggested 
that: 

1. When assembly operators are perform- 
ing their tasks in a stereotyped manner and 
no procedural changes are underway, the 
quality and quantity of production will be 
little affected by considerable variation in the 
emotional states of the operators. 

2. When changes in working procedures are 
introduced, emotionally disturbed workers will 
have considerably more difficulty making the 
transition than will relatively undisturbed 
operators and this will be reflected in reduced 
productivity. 

PROCEDURE 

The basic experiment conducted to test these 
hypotheses is, in essence, a simple one. Matched 
assembly line groups, long experienced at their jobs 
and performing identical operations were chosen 
as experimental groups. Over a period of several 
weeks, the experimentally “Disfavored” groups were 
systematically subjected to a series of common 
annoyances while the work situation of the ex- 
perimentally “Favored” groups was made as pleasant 
as possible. Following this period of manipulation 
of emotional state, an identical changeover in work 
procedure was introduced in the experimental groups. 
Throughout the course of the study detailed records 
were kept of the daily production of all of the 
groups. Comparison of the productivity of the ex- 
perimental groups allows evaluation of the effects 
of emotional state on workmanship during periods 
of normal operation when work is stereotyped and 
during periods of changeover when work requires 
close attention. 

Though the design of the experiment was simple, 
its realization was not. The sheer managerial com- 
plexity of training and coordinating the daily efforts 
of the manipulators and data collectors was such 
that it was simply not feasible to work with more 
than two pairs of experimental groups at a time. 
Since the number of groups is too small to allow
any considerable confidence in experimental differ-
ences the basic experiment was replicated three
times. The results of these three independent ex-
periments are presented in this paper.


1. The first experiment was conducted in the Home 
Laundry Department of General Electric’s Appliance 
Park in Louisville, Kentucky. This study will be 
called Louisville I.

2. Another experiment was conducted in the 
Owensboro Tube Plant of General Electric’s Receiv- 
ing Tube Department. This study will be called 
Owensboro. 

3. A third experiment was conducted again in the 
Home Laundry Department of Appliance Park and 
will be called Louisville II. 


Though the details of each study necessarily vary 
from factory to factory the basic procedure is 
identical in the three experiments. We shall, in this 
section, describe the essential similarities. 


Experimental Groups 


In all experiments, the experimental groups con- 
sisted of small assembly groups varying in size from 
5-11 workers. All groups were of the classic as- 
sembly line pattern arranged so that a piece passed 
from worker to worker emerging at the end of 
the line as a finished or semifinished product. Though 
each worker’s job is different, he repeats precisely 
the same set of operations on each piece on which
he works. In all of the experimental groups, the
operations involved only small manual movements 
such as parts assembly, fine welding, connecting 
a wire to a lead, and the like. 

In each experiment, the comparison groups were 
identical assembly lines performing the same opera- 
tions and making the same product. Such groups 
were matched as closely as possible in age, pro-
ductivity, experience on the line, length of employ-
ment with the company, and disciplinary record.
In all cases, attempts were made to choose experi-
mental groups whose productivity records indi-
cated that they had been working at a stable rate
for, at least, several weeks prior to the beginning 
of the experiment and where a majority of the 
operators on each line had been at their particular 
station for more than 6 months. Such workers, then, 
had repeated their basic operations hundreds of 
thousands of times. 
Manipulation of Emotional State


It was the intent of the manipulations to make 
one set of groups disturbed and upset and the 
other set as happy as possible. In each experimental 
locale, the cooperation of local engineering and 
supervisory personnel was enlisted in devising and 
executing a stream of skits and maneuvers designed
to produce hostile and irritated states of mind in 
one case, and states of contentment and satisfaction 
in the other. The specific manipulations, of course, 
varied from experimental locale to locale depending
upon the nature of the job and upon local conditions.
To eliminate any aspect of artificiality, all the 
manipulations used were characteristic of and normal 
to the working life of each factory. However, they 
were intense and concentrated in a relatively short 
period of time. 

The procedures employed in the Louisville I study 
illustrate the general nature of these manipulations. 
For the Disfavored group over a 3-week period of 
manipulation, at least one annoying incident oc- 
curred on 11 of the 15 working days. These incidents 
centered around two themes: (a) a threatening and 
persistent time-motion study of the operators on 
the Disfavored line, (b) a persistent attack by 
quality-control and supervisory personnel on the 
quality of the workmanship on this line. In ad- 
dition, during this period of manipulation, a series 
of irritating incidents were precipitated or aggra- 
vated. For example, when washers were in short 
supply, the Disfavored operators were forced to sort 
out washers from a collection of greasy and dirty 
parts.

For the Favored groups, the chief techniques 
employed were praise and flattery of the high qual- 
ity of work on this line by engineering and 
supervisory personnel. In addition to praise, when- 
ever management personnel were in the area, they 
went out of their way to be friendly and helpful 
and to give credit for suggestions. Finally, a de- 
liberate and constant effort was made to prevent 
the occurrence of any irritating events so that for 
the 3 weeks of manipulations, life, for these groups, 
was relatively smooth, untroubled, and pleasant. 

In all three studies, the strategy of the manipula-
tions was similar: maintain a continuous and per-
sistent nagging at the Disfavored groups throughout
the course of the manipulation period but concen- 
trate a flurry of annoying manipulations at the very 
end of the manipulation period so as to have the 
Disfavored groups as disrupted and angry as pos- 
sible on the day of changeover.4 The manipulations 
proper ceased completely the day the changeover 
started. Though ideally one would want to maintain 
maximum differences in emotional state between 
Favored and Disfavored groups all through the 
changeover period, we feared that inadvertently a 
manipulation of emotional state might artifactually 
affect productivity. Rather than take this chance 
we deliberately decided to end all manipulations of 
emotional state with the beginning of the change- 
over period. This feature of the experimental design 
does mean, however, that the emotional differences 
between the Favored and Disfavored groups prob- 
ably decreased during the course of the changeover 
period. 


4 This same factor dictated that the changeover,
in all studies, take place on a day in the middle of 
the work week. It seemed a reasonable hunch, that to 
allow a weekend to intervene between the final 
manipulations and the first day of changeover could 
only attenuate the impact of the manipulations. 
After each of these manipulated incidents, the
personnel involved dictated a detailed description
of what they had done and what reactions they
had evoked. In addition, the supervisory personnel
involved in these studies kept a record of remarks
and incidents relevant to the manipulations. This
material is used to evaluate the effectiveness of the
manipulations.


The Changeover 


Following the weeks of manipulation of emo-
tional state, an identical change in work procedure
was introduced for each pair of matched assembly 
groups. As far as the workers were concerned, these 
changes were coordinated to the production of a 
new product or the rebalancing of a line in order 
to produce a different number of units per day. 
A few days prior to the changeover, the operators 
were informed of the impending changeover and 
the reasons for it. For the first few hours of the 
day of changeover, the foreman and a trainer 
worked with the operators showing them 
the new procedures and then left the operators 
pretty much on their own. 

For most of the workers on these lines, their 
jobs were changed in some details. An example of 
the kind of change involved is the following. In the 
Owensboro study, the experimental groups shifted 
from the assembly of one vacuum tube type to 
another type. Before the change one of the op- 
erators had to drop and place one within another, 
two different grids in the small holes of a blank 
mica and then do the same operation with a 
cathode inside of the inner grid. After the change, 
this operator inserted only one grid and a cathode 
and dropped a plate over the grid. The difficulty 
of the changeover varied from station to station 
along a line but in most cases involved relatively 
minor changes in procedure. 


Measures


The chief data collected in all three studies were, 
of course, data on the quantity and quality of 
production of each group. As standard factory pro- 
cedure, an hour-by-hour record is kept of the 
number of units produced by each assembly group. 
In addition to these data a daily record was kept 
of downtime (periods during the working day when 
for some reason the conveyor belt was stopped) 
for each experimental group. By correcting for 
downtime, we are able to obtain equivalent daily 
indices of productivity for each group. 
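
One way such a downtime correction might be computed is sketched below in Python. The report does not state the formula used, so the index shown here (units per hour of conveyor running time), the function name, and the sample figures are assumptions made purely for illustration.

```python
def downtime_corrected_rate(units_produced, shift_minutes, downtime_minutes):
    """Return units per hour of actual running time.

    Hypothetical index: the report says only that output was
    "corrected for downtime", not how the correction was made.
    """
    running_minutes = shift_minutes - downtime_minutes
    if running_minutes <= 0:
        raise ValueError("no running time recorded for this day")
    return units_produced * 60 / running_minutes


# Illustrative figures only, not data from the study.
day_a = downtime_corrected_rate(1000, 480, 0)   # full shift, no stoppage
day_b = downtime_corrected_rate(900, 480, 48)   # 48 minutes of stoppage
```

Under this assumed index the two illustrative days come out equal, which is the sense in which a correction of this kind makes days with different amounts of stoppage comparable.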

The means of collecting data on the quality of 
workmanship varied from study to study depending 
on the nature of the operation involved. In the 
Louisville I study, the two repair operators received 
all of the defective units produced on the line. 
These operators kept a daily systematic record of 
each defect on a form which permitted separation 
of defects into those caused by poor workmanship 
and those caused by defective parts. 
In the Louisville II study, special operators were
trained to inspect all of the units produced by the
experimental groups. Again these operators kept a
daily systematic record of all defective units and of
their cause.

In the Owensboro study, all tubes produced on 
each line were tested on a machine which rejected 
defective tubes and automatically tabulated the 
nature of the defect. 

In addition to these basic data on production, 
other data were collected by means of observation 
and interviewing of the workers in some of the 
experimental groups. Where it is relevant, the tech- 
niques used will be presented in the body of this 
report. 


Sequence of Events 


In all three studies a common procedure and 
sequence of events was followed. After the experi- 
menters, in collaboration with the management of 
the plant involved, had selected the experimental 
groups, the workers in each of these groups were 
called together for a special meeting with one of the 
experimenters. The operators were told that a study 
would be conducted with several groups in the 
plant. The explanation of the presumed purpose of 
the study varied from experiment to experiment, 
but, in general, the studies were explained as an 
examination of factors affecting the maintenance 
of a good production rate. Such a study should 
be of value to the operators because what would 
be learned should help to keep average earnings up, 
and useful to the company for it should help in 
preventing delays and in meeting the increasing 
competition from various sources. The operators 
were, of course, told nothing of the true purpose 
of the study. The purpose of this meeting was to 
provide a rationale for the fairly constant presence 
of the experimenter or his staff on the factory floor 
and for the introduction of the new data collection 
procedures. 

A separate meeting was held with the foreman 
and supervisors of each of these groups. These 
people were told about the study in detail in order 
to insure that they would not inadvertently inter- 
fere with the manipulations or affect the outcome 
of the experiment. At this meeting, the roles of 
these supervisory personnel were explicitly defined 
for them. They were asked to be in the experi- 
mental work areas only when the normal require- 
ments of work demanded their presence, otherwise 
to stay away from these work areas. 

Following these orientation meetings a premanipu-
lation period of approximately 2 weeks was insti-
tuted. During this period, production data were
systematically gathered to provide a normal base- 
line. At the end of this period the manipulation of 
emotional state began and, depending on the ex- 
periment, lasted from 2 to 4 weeks. At the end of 
this time, the manipulations were halted and the 
changeover introduced. Again depending on the ex- 
periment, postchangeover data were collected for 
periods ranging from 1 to 4 weeks at which point
the experiment was terminated. 


Apologia 


In planning, each of these studies was consider- 
ably more sophisticated than in realization. Such 
methodological niceties as counterbalancing the ef- 
fects of day and night shift operations, repeated 
interviewing of the experimental operators, and the 
like were carefully incorporated into the preliminary 
design of each study. However, the problems of 
attempting to keep tight control of the experi- 
mental situation in the hubbub of a major indus- 
trial operation forced us to abandon many of 
these niceties if we were to have any experiment 
at all. 

Once underway, extended field experiments of this 
sort are particularly vulnerable to disruption. Each 
of the experimental groups, though small, was a 
vital part of an extensive industrial operation. In 
the Louisville I study, for example, the 22 workers 
involved in the experimental groups produced a
control unit which was an essential part of the
washing machines being built by their 1,700 col-
leagues. Any serious interference with the work of
the experimental groups could, therefore, disrupt 
the entire factory. This fact made it necessary to 
adhere to a rigid and fixed schedule of experimental 
operations. In preparing for the changeover, for 
example, it was necessary to plan weeks in advance 
to build up sufficient storage of control units to
guarantee that the factory could continue in opera- 
tion should the experimental groups fail to produce 
sufficient units on changeover. Such factors made 
the schedule of major experimental events almost 
completely inflexible and, therefore, unhappily sus- 
ceptible to chance factors which could ruin the ex- 
periment. For example, should a few of the ex- 
perimental operators be home sick on the crucial 
day of the changeover nothing could be done—the 
changeover could not be postponed, replacement op- 
erators would be put on the line, and the experiment 
would be ruined. These experiments were carried
out with superb cooperation from management but
as with all such studies they are still subject to the
disrupting chance factors characteristic of real life
situations.

In one way or another each of the three studies 
was plagued by difficulties of this sort. The major 
disruptions were the following: The Owensboro
study was originally planned with four pairs of
matched experimental groups. On the day of change-
over planned for two of these pairs, Kentucky
suffered a very severe snowstorm. There was suffi-
cient absenteeism so that none of these groups
worked. On the second and third day following
changeover absenteeism was still great and some
of these groups worked while others did not. By the
fourth day, all groups were working but enough
replacement operators were involved to still make
full comparison impossible. This debacle, of course,
forced us to discard the data of these four groups.
It should be noted, though, that the little data
that do exist for these groups support the hypotheses 
of the study. 

A difficulty of another sort interfered with the
Louisville I study. Originally this study was planned
for two pairs of matched work groups. Midway 
through the study, it was discovered that one 
of the Favored groups rotated jobs not only with 
one another but with members of other lines—a 
detail which made the productivity data collected 
on this group worthless. Although it might have 
been possible to have the foreman order the men 
to stick to their jobs, this could hardly have been 
considered an ideal event for a group that was 
presumably being favored and it was decided to 
drop this group and, of necessity, the Disfavored 
group with which it was matched, from the ex- 
periment. 


RESULTS 
Louisville I 


In the first study, the experimental groups 
consisted of two small assembly lines whose
job was the assembly of control units for 
washing machines. One of these groups worked 
on the day shift and the other on night shift. 
On each shift these were identically set-up 
line operations consisting of 11 operators, all 
but one of whom were women. The first five 
operators on these lines were involved en- 
tirely in assembly work, the following two 
operators in line were involved in assembly 
and testing operations, and the last four op- 
erators in testing, checking, and repair work. 
Only the first seven workers in each line are 
relevant to our study for these operators did 
all of the assembly work. 

Each of these lines was a paced-conveyor 
operation—that is, a conveyor belt traveling 
at a fixed speed, over which the operators 
had no control, carried the work mounts to 
each worker and the operator was forced to 
work at the pace set by this conveyor belt. 

The general nature of the manipulations 
of emotional state in this first study has been 
described in the preceding section. There is 
every indication that these manipulations
were successful in making the Disfavored
group (the day shift line) hostile, upset,
and angry while the Favored group (the night
shift line) remained relatively tranquil. The
protocol dictated by the manipulators and
supervisors of these people indicates again
and again that the Disfavored subjects were
disturbed by the manipulation. Angry com-
ments such as “Somebody is trying to cut our
throats” were common.

After exactly 3 weeks of manipulations, the 
changeover was introduced by means of re- 
balancing these lines. Before the change, each 
line was scheduled to produce 1,174 units 
per day. The new balance called for produc- 
tion of 1,044 units per day. This was effected 
by dropping one of the seven assembly work- 
ers from the line. Her job was redistributed 
among the remaining assemblers so that fol- 
lowing the change the jobs of five of the six 
remaining assemblers were changed in some 
procedural details. 

Following the change, the two groups re- 
mained at their new jobs for approximately 
4 weeks at which time the plant went on a 
plantwide one-shift operation and the study 
was over. 

Data on the quality of workmanship dur- 
ing the course of the study are presented 
in Table 1. The percentages in this table 
represent the proportion of the total number 
of units assembled which required repairs 
due to operator error. Each of these figures 
is based on the mean of those days during 
any particular experimental phase, for which 
data exist for both experimental groups.


TABLE 1
THE QUALITY OF WORK DURING THE
LOUISVILLE I EXPERIMENT

                                     Percentage of Assembled
                                     Units Requiring Repair
Phase of the          No. of         Disfavored    Favored
Experiment            Matched Days   Unit          Unit

Premanipulation
Early Manipulation
Late Manipulation
Changeover
  First Week a                       31.4          21.1
  Second Week                        28.0          13.8
  Remainder                          29.0          11.6

a For reasons discussed earlier, the changeover always took
place on a day in the middle of the week. In this and all fol-
lowing tables the period called “First Week” includes the data
from the day of changeover to the Friday of the same week,


On many days one or more of the regular op-
erators was absent from her station. If it was
impossible to find a thoroughly experienced
replacement (as it almost always was) the
data for this day’s work were automatically
discarded. Since inspection of the data made
it clear that there was a high day-by-day
correlation in the workmanship of the two
groups (due to such factors as a run of bad
parts) the data in this and all following tables
are based wholly on the figures for those days
for which the data exist for both experimental
groups. The number of such days is presented
in the column headed “Matched Days.” It
should be noted, though, that the magnitude
of the differences between groups in this and
all following tables is much the same whether
the figures are based on matched days or on
the total number of days for which data
are available.
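
The matched-days rule described above can be illustrated with a short Python sketch. It is offered only as one plausible reading of the procedure: the function name, the day labels, and the sample percentages are hypothetical and are not data from the study.

```python
from statistics import mean


def matched_day_means(disfavored, favored):
    """Average each group's daily defect percentage over matched days.

    Each argument maps a day label to that day's percentage of units
    requiring repair. A day whose data were discarded for a group is
    simply absent from that group's mapping, so only days present in
    both mappings enter the averages.
    """
    matched = sorted(set(disfavored) & set(favored))
    if not matched:
        raise ValueError("no day has data for both groups")
    return (
        len(matched),
        mean(disfavored[day] for day in matched),
        mean(favored[day] for day in matched),
    )


# Illustrative values only, not data from the study.
disfavored = {"mon": 12.0, "tue": 14.0, "thu": 30.0}
favored = {"mon": 11.0, "wed": 9.0, "thu": 13.0}
n_days, dis_mean, fav_mean = matched_day_means(disfavored, favored)
```

Restricting both averages to the same days keeps a shared disturbance, such as a run of bad parts, from being counted against one group only.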

It will be noted that there are no entries 
for the normal or premanipulation period. 
This unfortunate state of affairs is due to the 
fact that for most of this 11-day period, one 
or another of the four data collectors was 
home sick with the Asiatic flu. In order to 
give some notion of the pre-experimental 
difference between these two groups, we have 
divided the manipulation period into two 
phases: the early phase or first week of 
manipulations, at which time the differences 
in emotional disruption of the two groups 
must still have been relatively small, and 
a late manipulation phase or the last 2 weeks 
of the manipulation period. It will be noted 
that the two groups are virtually identical 
during the early manipulation period differing 
by only 1.2%. And for the late manipulation
period, the two groups are still quite similar
differing by only 3%. The evidence is good,
then, in support of our first hypothesis. At 
a time that work is stereotyped, emotional 
state has virtually no effect on workmanship. 
A day-by-day comparison of these two groups 
during the total manipulation period indicates 
striking similarity between the two groups. 
For 8 of the 11 matched days the two groups 
differ by 2% or less. 

Following the changeover, the difference 
between the two groups is marked. In the 
first 3 days following the changeover, the 
Favored group increases its errors by 9.4% 
while the Disfavored group increases by 
16.7%. In the following weeks the difference
between the groups grows even larger for the 
Favored unit quickly returns to its prechange 
level of errors, while the Disfavored unit 
remains at a level almost twice that of its 
prechange level for the remainder of the 
study. For the entire manipulation period, the 
Disfavored group made 2.3% more errors 
than did the Favored group. For the entire 
postchangeover period, the Disfavored unit 
made 16.7% more errors than did the Favored 
unit. The evidence is good, then, in support 
of our second hypothesis—when changes in 
working procedures are introduced, emotion- 
ally disturbed workers will have more diffi- 
culty at their work than will relatively un- 
disturbed operators. 
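
The comparison behind this conclusion is the change in the gap between the two groups from the manipulation period to the postchangeover period. The Python sketch below works through that arithmetic with the two gaps reported in the text. It assumes that both figures are differences in percentage points of units requiring repair; the variable and function names are illustrative.

```python
def gap_widening(pre_gap, post_gap):
    """Return how much the Disfavored-minus-Favored gap grew."""
    return post_gap - pre_gap


# Gaps reported in the text, taken here to be percentage points.
manipulation_gap = 2.3      # entire manipulation period
postchangeover_gap = 16.7   # entire postchangeover period

widening = round(gap_widening(manipulation_gap, postchangeover_gap), 1)
```

On this reading the gap between the groups widens by roughly 14 percentage points once the work ceases to be stereotyped.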

So far we have considered the effect of 
the manipulations on the quality of work- 
manship. With respect to quantity of output, 
during the several phases of the experiment, 
these groups differed from one another by 
less than 1% in the number of units pro-
duced. Following the change, both groups
came up to the scheduled production rate 
quite rapidly. Both groups were within a few 
units of scheduled production on the days 
immediately following the change and by the 
fifth day both groups steadily produced at 
the scheduled rate. 

This pattern of marked difference in qual- 
ity and no difference in quantity is hardly 
surprising when the specific nature of the 
operation is considered. It will be remembered 
that these groups worked at a paced conveyor 
operation. Except for emergencies when the 
line could be stopped, the operators had no 
control of the production rate. Any conspicu-
ous number of missed units would bring
managerial pressure to bear, and unless an
operator was willing to precipitate a crisis,
she had little choice but to do her job in,
if necessary, a hasty fashion.

Certainly the general theoretical considera-
tions guiding this study would lead one to
suspect that, in work situations where it is
possible for quantity to vary, one should find
evidence that both quantity and quality of
work will vary with our manipulations. The
Owensboro study was designed both to repli-
cate the first experiment and to test this
possibility.


Owensboro 


The Owensboro study was conducted in a 
factory which manufactured vacuum tubes 
for electronic equipment. There were four ex- 
perimental groups. Two of these groups, 
known as cage units, were occupied with the 
assembly of the component parts of a tube. 
The remaining two units, known as weld 
units, were concerned with welding the prod- 
uct of the cage unit to a stem. Both pairs of 
units, of course, performed identical opera- 
tions. Each cage unit was composed of five 
female operators and each welding unit of 
eight female operators. All experimental units 
worked on the day shift. 

Both the cage and weld groups were, in 
good part, self-paced units. Unlike the Louis- 
ville I setup, the individual operators on these 
units could, within limits, establish their 
own rate of work. 

Production data were collected for 2½
weeks before the beginning of the manipula- 
tions. The manipulations themselves lasted 4 
weeks. The chief theme of the Disfavored 
manipulations centered on the presumed dis-
covery that the Disfavored units were re-
sponsible for a contamination of tubes that
resulted in the rejection of large numbers of 
the products of these units. The search for 
the precise source of this contamination 
allowed the manipulators to “legitimately” 
put these operators through a series of quite 
annoying incidents such as forcing the girls 
to wash their hands, to wear irritating special 
equipment, and the like. In addition, the 
work of these operators was disparaged and 
they were submitted to a continuous stream 
of annoyances of one sort or another. As 
in the previous study, the Favored manipula- 
tions were based largely on a combination 
of praise, flattery, and friendliness. Judging 
from the protocol dictated by the manipu- 
lators and supervisors, the manipulations were 
effective in producing the differential states 
of dissatisfaction required for the experiment. 

The changeover was made by changing the 
tube type on which these lines worked. This 
was a new tube for all of these operators and 
every girl’s job was changed. Productivity
data were collected for approximately 4 weeks 
after the change.

TABLE 2
THE QUALITY OF WORK IN THE CAGE UNITS
DURING THE OWENSBORO EXPERIMENT

                                  Percentage of Assembled Units
                                  Rejected by Testing Machine
Phase of the        No. of
Experiment          Matched Days  Favored Unit   Disfavored Unit
Premanipulation          4            0.41            0.24
Manipulation            13            0.27            0.27
Changeover
  First Week             3            0.18            0.95
  Second Week            5            0.52            0.54
  Remainder              6            0.43            0.15


Let us examine, first, the data on the qual- 
ity of work which is presented in Table 2. 
This table presents data on the quality of 
work of the two cage units.° The figures in 
the table represent the proportion of the 
assembled units which are, owing to cage
operator error, rejected by the testing ma- 
chine. Examining, first, the prechangeover 
data we note again strong support for the 
first hypothesis. Where the difference between 
the two groups is relatively small before the 
manipulations, it grows even smaller during 
the manipulation period. Clearly, emotional 
state has little impact on stereotyped work. 

Immediately following the changeover, the 
effects are similar to those in the Louisville 
I study. Where the two groups made identical 
proportions of errors before the change, after 
the change the Disfavored group makes more
than five times the errors made by the
Favored group. Unlike the Louisville I
study, however, the effect is brief for by the
second week this difference between the two 
groups has vanished and in the final days 
of the experiment the two groups are back, 

° Unfortunately, quality data on the weld units
are not available. Unknown to the experimenters, 
the foreman added extra repair operators to the 
weld lines for the first few days of the changeover 
period. Since any quality differences might as well 
be due to differences between these repairmen as to 
differences in the experimental groups, the 
quality data must be treated as meaningless.

TABLE 3
AVERAGE HOURLY OUTPUT IN THE CAGE UNITS DURING
THE OWENSBORO EXPERIMENT

                                                           Percentage of Production
                                  Average Hourly           during Manipulation
                                  Production               Period
Phase of the        No. of
Experiment          Matched Days  Favored    Disfavored    Favored    Disfavored
                                  Unit       Unit          Unit       Unit
Premanipulation          5a         265        255
Manipulation            13          293        288
Changeover
  First Week             4a         143        120          48.8       41.7
  Second Week            5          168        158          57.3       54.9
  Remainder              5          210        204          71.7       70.8

a The number of matched days do not coincide exactly with
those in Table 2. Operational difficulties occasionally made it
impossible to collect quality data for a particular day.


approximately, to their premanipulation levels 
of workmanship. We shall consider this dif- 
ference between the Louisville I and Owens- 
boro studies after presenting additional data. 
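
The quality comparison drawn from Table 2 reduces to a ratio of reject percentages. The following is a minimal Python sketch of that arithmetic, using the Manipulation, First Week, and Second Week entries of Table 2; the function name `error_ratio` is hypothetical and introduced only for illustration.

```python
def error_ratio(disfavored_pct, favored_pct):
    """Ratio of the Disfavored to the Favored reject percentage."""
    return disfavored_pct / favored_pct

# Reject percentages from Table 2 (cage units).
manipulation = error_ratio(0.27, 0.27)  # before the change
first_week = error_ratio(0.95, 0.18)    # first week after the change
second_week = error_ratio(0.54, 0.52)   # second week after the change

print(round(manipulation, 2), round(first_week, 2), round(second_week, 2))
# 1.0 5.28 1.04
```

A ratio of 1.0 corresponds to "identical proportions of errors," and the first-week ratio of about 5.3 to "more than five times the errors."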

What about the quantity of output? Un- 
like the Louisville I operation, the setup 
in Owensboro did involve some degree of 
worker control of output. The relevant data 
are presented in Table 3 for the cage units 
and Table 4 for the weld units. The figures 
in the tables represent the average number 
of units produced per hour. In both tables 
we note precisely the same trends. In the 
premanipulation periods the Favored and Dis- 
favored groups in the two pairs of lines are 
fairly well matched in production. During the 


TABLE 4
AVERAGE HOURLY OUTPUT IN THE WELD UNITS DURING
THE OWENSBORO EXPERIMENT

                                                           Percentage of Production
                                  Average Hourly           during Manipulation
                                  Production               Period
Phase of the        No. of
Experiment          Matched Days  Favored    Disfavored    Favored    Disfavored
                                  Unit       Unit          Unit       Unit
Premanipulation          7          223        207
Manipulation            14          213        220
Changeover
  First Week             2           88         75          41.3       34.1
  Second Week            5          124        124          58.2       50.4
  Remainder              6          141        148          66.2       67.3

manipulation period the difference between
matched groups remains small. Again the evi- 
dence is good that emotional disturbance has 
little effect on stereotyped behavior. 

Immediately following the change the two 
Favored units produce at a better rate than 
do their Disfavored counterparts. The differ- 
ence of 23 units in Table 3 and 13 units in 
Table 4 between Favored and Disfavored 
groups during the first week of change may 
not appear to be large but it should be kept 
in mind that these are hourly production fig- 
ures. Over an 8-hour working day during the 
first week of change, the Favored cage unit 
produced a daily 184 units more than did the 
Disfavored group compared to a daily 51-unit 
advantage for the Favored group during the 
combined manipulation and premanipulation 
periods. Similarly, the Favored weld unit had 
a daily advantage of 104 units over the Dis- 
favored weld units during the first changeover 
week compared to its daily 5-unit advantage 
during the combined manipulation and pre-
manipulation periods.
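
The daily figures quoted here can be recovered from the hourly rates in Tables 3 and 4. The sketch below is illustrative only. It assumes the 8-hour day stated in the text and, as an assumption not stated in the text, that the baseline advantage is the hourly difference averaged over the premanipulation and manipulation phases, weighted by the number of matched days. The function names are hypothetical.

```python
HOURS_PER_DAY = 8  # working day length given in the text

def daily_advantage(favored_hourly, disfavored_hourly):
    """Daily unit advantage of the Favored unit, from hourly rates."""
    return (favored_hourly - disfavored_hourly) * HOURS_PER_DAY

def baseline_daily_advantage(phases):
    """Day-weighted daily advantage over several phases.

    phases: list of (matched_days, favored_hourly, disfavored_hourly).
    Weighting by matched days is an assumption, not stated in the text.
    """
    total_days = sum(days for days, _, _ in phases)
    weighted = sum(days * (fav - dis) for days, fav, dis in phases)
    return weighted / total_days * HOURS_PER_DAY

# Premanipulation and manipulation rows of Table 3 (cage) and Table 4 (weld).
cage_baseline = baseline_daily_advantage([(5, 265, 255), (13, 293, 288)])
weld_baseline = baseline_daily_advantage([(7, 223, 207), (14, 213, 220)])

# First changeover week.
cage_first_week = daily_advantage(143, 120)
weld_first_week = daily_advantage(88, 75)

print(cage_first_week, round(cage_baseline))  # 184 51
print(weld_first_week, round(weld_baseline))  # 104 5
```

Under this weighting the sketch reproduces the 184- and 104-unit first-week advantages and the 51- and 5-unit baseline advantages cited in the text.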

Correcting the absolute rate of production
by the prechange rate, it can be seen (lower
right section of Tables 3 and 4) that during
the first week of the changeover the Favored
cage unit is producing at 48.8% of the rate
it was producing during the manipulation 
period, while the Disfavored cage unit pro- 
duces at 41.7% of its previous rate. A simi- 
lar difference exists for the two weld groups. 
It would appear, then, that our manipulations 
have affected both the quality and quantity 
of work at changeover times. 
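
The correction described in this paragraph divides each unit's changeover rate by its own manipulation-period rate. A minimal sketch of that calculation, with the hourly rates taken from Tables 3 and 4 and a hypothetical function name:

```python
def percent_of_baseline(changeover_hourly, manipulation_hourly):
    """Changeover output as a percentage of the manipulation-period rate."""
    return round(100 * changeover_hourly / manipulation_hourly, 1)

# First changeover week; hourly rates from Tables 3 and 4.
cage_favored = percent_of_baseline(143, 293)
cage_disfavored = percent_of_baseline(120, 288)
weld_favored = percent_of_baseline(88, 213)
weld_disfavored = percent_of_baseline(75, 220)

print(cage_favored, cage_disfavored)  # 48.8 41.7
print(weld_favored, weld_disfavored)  # 41.3 34.1
```

Expressing each unit relative to its own baseline removes any standing difference in output between the matched units, so the comparison reflects only the drop at changeover.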

As with the quality data, these effects on 
output are temporary. By the second week 
after changeover, the differences between 
Favored and Disfavored groups grow quite
small and by the third week the two pairs of 
groups are producing at virtually identical 
rates. Why are the effects in the Owensboro 
study temporary while those in the Louisville 
I study appear to be long-lasting? An explana-
tion for this difference seems to us reasonably
apparent. 

It should be remembered, first, that owing 
to the paced-conveyor working arrangement, 
the experimental effects in the Louisville I 
study are limited to the quality of workman- 
ship while in the Owensboro study both qual-
ity and quantity are affected. If one examines
the nature of the product produced in each of 
these operations, an explanation of the tem- 
porary vs. long-lasting effect seems reasonably 
clear. The Owensboro plant manufactured 
vacuum tubes, a product which is virtually 
nonrepairable once the mount has been sealed 
in its glass or metal housing. Since bad parts 
and errors in workmanship are inevitable, the 
plant maintains an extensive testing operation 
to insure that defective tubes are not shipped 
out to its customers. Every tube produced is 
tested and the nature of its defect catalogued 
and assigned to the particular line which pro- 
duced the tube. Of necessity, the plant is 
exquisitely quality conscious and every fore- 
man, supervisor, and line operator knows how 
many defective tubes they produced each day. 
Inevitably considerable pressure is brought to 
bear on lines which deviate from an accepted 
maximum of rejects. 

In the Louisville operation, on the other 
hand, a defective unit is easily remediable by 
a repair operator. Owing to this fact the plant 
kept no records of quality and the only per- 
sons in the entire plant who could have even 
a rough idea of the quality of workmanship 
on our experimental lines were the repair 
operators. Since these operators were appar- 
ently able to handle this extra work and 
appear to have said nothing about it, no 
pressure was put on the lines to remedy their 
workmanship. 

Our thesis, then, is that absence of super- 
visory pressure perpetuated the differences 
between the experimental lines in the Louis- 
ville I study. If this is correct, we should 
certainly expect that were the foreman to put 
pressure on the offending Louisville line, their 
errors would decrease. And the evidence indi-
cates that this is indeed the case. Four days
before the end of the Louisville I study (the
data for these 4 days are not included in
Table 1), the experimenters asked the foreman
of the previously Disfavored group to pressure
this group about the quality of their work. He
simply went over to some of the assemblers
on this line, told them that they were making
too many errors, far more than the night
shift, and asked them to try to improve. He
worked briefly with one of the operators sug-
gesting changes in her work procedure and
then left. In the period immediately preceding 
this interlude (the period called “Remainder” 
in Table 1) these operators had been pro- 
ducing 29.0% defective units. In the 4 days 
following the foreman’s intercession, they re- 
duced their level of errors to 17.3%, a figure 
only slightly greater than that during the 
manipulation period. It would appear, then, 
that the effects of our manipulations are, with 
time and effort, correctible. 

The Owensboro study, then, increases our 
confidence in the basic phenomenon under 
test. In two independent comparisons, the 
results of the Louisville I experiment repli- 
cate. In addition, the results of this study 
add two items of information to our knowl- 
edge of the phenomenon. First, emotional dis- 
turbance will affect the quantity as well as the 
quality of work. Second, the effects on work- 
manship appear to be remediable by super- 
visory pressure. 


Louisville II 


The Louisville II study was designed first 
as an additional replication of the basic ex- 
periment and, second, as a means of making 
additional methodologically required compari- 
sons. The reader has undoubtedly noted that 
before one can have any real confidence in 
the interpretation of the phenomena under 
study one would require further experimental 
control and comparison groups. It would be 
desirable, for example, to have a control 
group going through precisely the same se- 
quence of events but with no manipulation 
of emotional state. More importantly, perhaps, 
one would want some subjects who went 
through the manipulations of emotional state 
but who were not involved in a changeover. 
Perhaps the growing state of dissatisfaction 
has simply come to a head at changeover time 
and the disintegration of work is due to this 
factor rather than to the interaction of emo- 
tional state and the changeover. The Louis- 
ville II study was designed, in part, to make 
it possible to compare such subjects with 
subjects who were involved in a changeover. 

Four small assembly lines served as the 
experimental groups. Two of these lines were
involved in the assembly of heater units for 
home laundry drying machines and two of the 
lines assembled motor-blower units for these 
same drying machines. There were five op- 
erators employed on each of the lines and as 
in the Louisville I study the lines were largely 
paced conveyor operations. The lines oper- 
ated on day and night shifts and the manipu- 
lations were so arranged that for the heater 
lines the day shift group was the Disfavored 
group and the night shift the Favored group. 
The reverse arrangement, of course, was made 
for the motor-blower lines. 

The changeover was so organized that on 
each line some of the jobs were changed and 
some were not. For each comparable pair of 
lines, of course, the same positions were 
changed. All told, the jobs of 10 of the 
operators were changed while 10 of the jobs 
remained the same. 

The sequence of events in this study was 
precisely the same as that in the preceding 
studies. Unhappily, at the time of this study, 
a particularly rigid schedule forced us to 
compress all phases of the study and we did 
not devote nearly as much time to the manipu- 
lation of emotional state. As a consequence 
there appeared to be, at best, only slight dif- 
ferences in emotional state between the Fa- 
vored and Disfavored groups. Protocol dic- 
tated by the manipulators and supervisory 
personnel involved gave no indications that 
the Disfavored groups were growing angry 
and no evidence of differential emotional dis- 
turbance between the experimental groups. In- 
terviews with and observation of individual 
operators on the line corroborated this im-
pression.

Although the manipulations had failed to 
produce groups that were differentially satis- 
fied and dissatisfied, it was possible to classify 
the operators, by means of ratings, into satis- 
fied and dissatisfied categories. During the 
study, the experimenters’ two assistants had 
interviewed all of these operators, talked with 
them extensively, and observed them at work 
daily. These two assistants independently 
rated each worker on a five-point “dissatisfac- 
tion” scale. Their ratings were correlated .84. 

To analyze the productivity data, each op- 
erator in the day shift was paired with his 
counterpart in the night shift. Thus, within
each pair the operators were matched for the
kind of work they performed. The pairs of 
operators were sorted into two classes: one 
class contained those pairs whose jobs had 
been directly affected by the changeover, and 
the other class contained those pairs whose 
jobs were not changed. Then, the more dis- 
satisfied member of each pair was compared 
with his counterpart to see how each reacted 
to the changeover. 
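
The pairing procedure just described can be sketched as follows. This is an illustration under stated assumptions, not the authors' actual procedure: the `Operator` record, the position labels, the use of a single averaged rating per operator, and the dropping of tied pairs are all hypothetical, and the ratings shown are made up for the example.

```python
from dataclasses import dataclass

@dataclass
class Operator:
    position: str           # job position on the line (hypothetical label)
    shift: str              # "day" or "night"
    job_changed: bool       # whether the changeover altered this job
    dissatisfaction: float  # assumed: mean of the two raters' ratings

def classify_pairs(operators):
    """Pair day/night counterparts by position and split by job change.

    Returns {"changed": [...], "unchanged": [...]}; each entry is a
    (more_dissatisfied, less_dissatisfied) tuple.
    """
    by_position = {}
    for op in operators:
        by_position.setdefault(op.position, {})[op.shift] = op

    result = {"changed": [], "unchanged": []}
    for pair in by_position.values():
        if "day" not in pair or "night" not in pair:
            continue  # no counterpart on the other shift
        day, night = pair["day"], pair["night"]
        if day.dissatisfaction == night.dissatisfaction:
            continue  # assumption: tied pairs are dropped
        more, less = sorted((day, night), key=lambda o: -o.dissatisfaction)
        key = "changed" if day.job_changed else "unchanged"
        result[key].append((more, less))
    return result

# Made-up ratings for illustration only; not data from the study.
ops = [
    Operator("heater-1", "day", True, 4.0),
    Operator("heater-1", "night", True, 2.0),
    Operator("heater-2", "day", False, 1.5),
    Operator("heater-2", "night", False, 3.0),
]
groups = classify_pairs(ops)
changed = [(m.shift, l.shift) for m, l in groups["changed"]]
unchanged = [(m.shift, l.shift) for m, l in groups["unchanged"]]
print(changed)    # [('day', 'night')]
print(unchanged)  # [('night', 'day')]
```

Because each pair shares a position, any difference in error rates within a pair cannot be attributed to the kind of work performed.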

Table 5 presents the data on the quality of 
work of operators whose jobs were changed. 
It will be noted that before the changeover 
the Satisfied workers consistently make more 
errors than do the Dissatisfied operators. 
After the changeover this is reversed with the 
Dissatisfied operators more than tripling their 
rate of errors while the Satisfied operators in- 
crease their errors by only some 50%. 

Similar data for those operators whose jobs 
were not changed is presented in Table 6. 
Before the changeover time, Dissatisfied op- 
erators make more errors than do Satisfied 
ones. After the changeover time, the difference 
between these two groups of operators is 
even smaller than before. Clearly, the states 
of satisfaction and dissatisfaction have af- 
fected the workmanship only of those opera- 
tors whose jobs were changed. 

Taken by themselves the trends noted in 
Tables 5 and 6 would, of course, be considered 
extremely tenuous because of the post hoc 
nature of the analysis. When considered along 
with the results of the previously described 


TABLE 5
THE RELATIONSHIP OF WORKER SATISFACTION AND
DISSATISFACTION TO THE QUALITY OF WORK DURING
THE LOUISVILLE II EXPERIMENT

(Workers whose jobs had been changed)

                                  Percentage of Assembled
                                  Units Requiring Repair
Phase of the        No. of
Experiment          Matched Days  Satisfied     Dissatisfied
                                  Workers       Workers
Premanipulation        3-4
Manipulation
Changeover
  First Week

[Most cell values are missing from this copy. The surviving entries
are 1.39, 0.56, and 0.83; their column assignment is not recoverable.]

TABLE 6
THE RELATIONSHIP OF WORKER SATISFACTION AND
DISSATISFACTION TO THE QUALITY OF WORK DURING
THE LOUISVILLE II EXPERIMENT

(Workers whose jobs had not been changed)

                                  Percentage of Assembled
                                  Units Requiring Rework
Phase of the        No. of
Experiment          Matched Days  Dissatisfied  Satisfied
                                  Workers       Workers
Premanipulation        3-4
Manipulation           8-11
Changeover
  First Week

[Most cell values are missing from this copy. The surviving entries
are 0.26, 0.27, and 1.06; their column assignment is not recoverable.]

studies, however, we can have considerably 
more confidence in these data and, indeed, 
they add one new item to our knowledge of 
the phenomena under consideration. A state 
of dissatisfaction in a worker, whether created 
experimentally or already in existence, will 
adversely affect his productivity only when 
his work is not stereotyped. 


DISCUSSION 


Though the results of any single one of 
these experiments must be treated as a case 
study owing to the small number of cases 
involved, the three experiments viewed to- 
gether do give us considerably more confidence 
in the hypotheses under test. In three experi- 
ments, involving four independent compari- 
sons, we find precisely the same pattern of 
results. Emotional disturbance has little effect 
on stereotyped activity, but does have a dis- 
rupting effect on nonstereotyped activity. 

Though the basic phenomena under test 
seem replicable and reasonably well estab- 
lished, we must admit that our understanding 
of the phenomena is, at best, crude and that 
our theoretical statement is a loose, though 
plausible, formulation which is married only 
roughly to the experimental test situation. 
At almost every step in this formulation we 
have made assumptions for which there is 
relatively little external support. We have, for 
example, assumed that the assembly operation 
during regular production periods is stereo-
typed activity and during changeover times
is not. Is this correct? Happily, on this point,
relevant evidence is available. If this assump- 
tion is correct, we should expect that during 
regular operations the thoughts and conversa- 
tion of the workers will be largely irrelevant 
to the job at hand while during changeover 
times they will be much more concerned with 
their work. In the Louisville I study, a trained 
observer observed each of the experimental 
groups at work for half-hour periods every 
second working day throughout the 10 weeks 
of the study. Among other things, he cate- 
gorized all interactions as either work related 
(relevant to the job at hand or anything to 
do with General Electric) or nonwork related 
(kidding around, remarks about politics, per- 
sonal matters, weather, etc.). In Table 7 we 
have recorded the proportion of operator in- 
teraction that was relevant to work during 
the course of the study. The average number 
of interactions recorded during the half-hour 
observation periods over the course of the 
study was 36.8 for the Favored group and 
31.9 for the Disfavored group. 

It can readily be seen in this table that 
during the manipulation period interaction 
among the operators was predominantly ir- 
relevant to the job. In the period immediately 
following the changeover the proportion of 
interaction concerning work increased mark- 
edly, and then decreased steadily to pre- 
changeover levels. To the extent that this 
categorization of interaction is acceptable as 
an indirect index of stereotypy, our characteri-
zation of the assembly process may be consid-
ered as reasonable and supported by the data.

Perhaps the chief ambiguity in this theo- 
retical scheme is the precise nature of the 
presumed link between emotional disturbance 


TABLE 7
THE PERCENTAGE OF OPERATOR INTERACTION
THAT WAS RELEVANT TO WORK

               Manipulation    After Changeover
Group          Period          Days 1-7
Favored
Disfavored

[Cell values and the remaining column headings are missing from this copy.]

and nonstereotyped behavior. Such a link has 
been hypothesized but in no way has the 
mechanism of this relationship been elabo- 
rated. At this stage of investigation, many 
alternative explanations are possible. For ex- 
ample, it may be that the chief effect of the 
emotionally disrupting manipulations has been 
to diminish the motivation to do a really good 
job. Such an effect would be unlikely to affect 
already stereotyped behavior but might very 
well interfere with the acquisition of good, 
new work habits. Alternatively, it is possible 
that the state of fatigue consequent on the 
repetition of a task requiring concentration is 
chiefly responsible. Possibly when the indi- 
vidual is in such a state of fatigue or ex- 
haustion, his emotions and irritations are par- 
ticularly liable to affect his behavior. When 
behavior is stereotyped, fatigue is less likely 
and performance should not be affected. When 
repetitious behavior is nonstereotyped, the 
interaction of emotional disturbance and 
fatigue may lead to particularly deteriorated 
performance and the sheer repetition of such 
deteriorated behavior may again lead to the 
automatization of ineffective work habits. Still 
other alternatives are possible and it is clear 
that the precise understanding of the phe- 
nomena here demonstrated demands experi- 
mentation specifically directed to clarification 
of this relationship. 

Numerous other questions, of course, open 
up as well. For example, are these effects spe- 
cific to only disturbing emotional states such 
as the anger and hostility which we manipu- 
lated or do they generalize to such strong, but 
benevolent, states as joy and euphoria? This 
question and others are currently under in- 
vestigation in laboratory experiments being 
conducted by J. Arrowood, B. Latané, and B. 
Schuler (1960 unpublished). 


SUMMARY 


Though the introduction of new work pro-
cedures is a frequent event in many industries,
the smoothness of the transition from one
working procedure to another is usually un- 
predictable. Sometimes rebalancing an as- 
sembly line proceeds with no difficulty; at 
other times productivity drops precipitously 
after a change and it requires weeks for a
work group to reach expected quality and pro-
duction goals. Engineering, planning, super-
visory, and psychological factors are all in- 
volved in such a change, and it was the pur- 
pose of this study experimentally to examine 
the effects of emotional factors on the success 
of a planned change. 

It is commonly assumed that emotional 
states such as anger and hostility are dis- 
ruptive of performance. It is here hypothesized 
that such emotional states will be maximally 
disruptive of behavior that requires thought 
and concentration; but will have little effect 
on behaviors that are stereotyped, that is, be- 
haviors such as walking or eating that are so 
well mastered and habitual that they require 
neither attention nor thought. It is suggested 
that, once mastered, the typical assembly op- 
eration is of precisely this stereotyped char- 
acter. The effect of introducing a change in 
working procedure is to convert the assembly 
operation from a completely stereotyped op-
eration into one which requires, for a time,
complete concentration. This analysis suggests
the following hypotheses: 

1. During regular factory operations, when 
no procedural changes are underway, the qual- 
ity and quantity of production will be little 
affected by wide variations in emotional states 
disturbing the operators. 

2. At times when changes in working pro- 
cedure are being introduced, emotionally dis- 
turbed workers will have more difficulty 
making the transition than will relatively un- 
disturbed operators. 

To test these hypotheses, three independent 
field experiments were conducted on assembly 
groups working in General Electric factories. 
The results of all three studies support the 
hypotheses.


TARGET TRACKING AND ACQUISITION IN THREE
DIMENSIONS USING A TWO-DIMENSIONAL
DISPLAY SURFACE ¹


CHARLES S. MORRILL AND BARBARA L. DAVIES


MITRE Corporation 


A great deal of experimentation has been 
reported concerning display-control com- 
patibility and its effect upon operator per- 
formance during target acquisition and target 
tracking tasks (Andreas, Finck, Green, 
Smith, & Spragg, 1959; Ellson, 1947; Ely, 
Thomson, & Orlansky, 1956; Fitts, 1951; 
Tufts College, 1952, Part 6, Ch. 3, Sect. 4). 
Most of this past work is concerned with 
display movement in only two dimensions. 
On the other hand, operator difficulties en-
countered in target tracking and/or acqui-
sition in three dimensions, azimuth, range,
and elevation, on a two-dimensional display 
surface, remain relatively unexplored. The 
objective of the present study is to investi- 
gate the effects of four different display- 
control systems upon operator performance 
during target acquisition and target tracking 
tasks using three dimensions. 

In this study target azimuth and range 
were represented by a symbol (a single dot) 
generated by one channel of a Dumont 
dual-beam scope and capable of moving 
along the x and y axes simultaneously. 
Target elevation was represented by a sym- 
bol (a short vertical line) generated by the 
second scope channel and capable of vertical 
movement along the right-hand strip of the 
display surface. The display symbols which 
represented the hand-control positions will 
be referred to as the azimuth-range symbol 
and the elevation symbol. A static display is 
shown in Figure 1. 

Based on a similar display-control model 
for tracking in three dimensions, a recom- 
mendation has been made by Dunlap and 
Associates (1957) for a hand-control design 


1 The research reported in this article was sup-
ported by the Department of the Air Force under
Air Force Contract AF-33(600)39852. The original
version of this article was published as a MITRE
Technical Series Report, January 1961.

in an airborne radar system. This recommen-
dation suggests a display-control configura-
tion which is a functional representation of
the information. As a result, an incompatibil-
ity exists between the direction of the dis-
played motion of the azimuth-range symbol
in range and the elevation symbol on the dis- 
play surface; i.e., a forward motion of the 
hand control results in an upward motion of 
the azimuth-range symbol, while a similar for- 
ward motion of the thumbwheel results in a 
downward motion of the elevation symbol. 

A survey carried out by Morrill and 
Sprague (1960) indicated that there is an 
over-all preference for a display which is a 
direct representation of the hand-control 
movements. This direct representation em- 
ploys a compatible system rather than the 
incompatible system which resulted from 
Dunlap’s functional representation of the 
equipment. (A system is compatible when 
identical directional movements of the hand 
control and the thumbwheel produce move- 
ments of the azimuth-range symbol and the 
elevation symbol that are the same in di- 
rection on the scope.) Furthermore, one group 
of subjects indicated a direction-of-movement 
preference, namely, a backward motion of the 
hand control and thumbwheel to produce up- 
ward movements of the symbols of the dis- 
play. Subjects in the survey study were asked 
to state their display-control preferences in 
order to acquire specific targets, whereas this 
manuscript reports actual performance dur- 
ing target acquisition and target tracking 
tasks when both compatible and incompatible 
display-control relationships were used. 

The results of the study described in the 
present report are generally applicable in 
systems which require information concern- 
ing the display-control relationships appro- 
priate for the operation of dynamic controls 
and in systems which provide for manual

FIG. 1. Static display.

hand control movements as either the pri- 
mary or the back-up mode of operation. For 
example, in the design of the display and 
control portions of an airborne radar sys- 
tem, the data from this study may serve to 
answer questions which arise regarding the 
compatibility of the movements of a hand 
control with the corresponding movements 
of the display symbols. 


METHOD 
Subjects 
Four groups of 10 subjects each were used. The
subjects were Air Force enlisted personnel from
the 3245th AC&W Squadron, Hanscom Air Force
Base, Bedford, Massachusetts.


Tasks 


Each subject performed two tasks. During Task 
I, the subject was instructed to acquire the target 
(i.e., to achieve lock-on) as quickly and accurately 
as possible. This task required that the subject
place the azimuth-range symbol (a small circle)
around the target and similarly superimpose the
elevation symbol on the target. Each subject was
given four practice trials prior to 64 test trials.
Task II, for each subject, was to track a moving
target in azimuth, range, and elevation for a period
of 5 minutes during each of four trials.

Groups A and B used compatible display-control
systems for both tasks, whereas Groups C and D
used incompatible display-control systems. The dis-
play-control systems that were used are listed in
Table 1.


Apparatus 


Each subject was seated in front of a dual-beam 
cathode-ray-tube display (CRT) mounted on a 
console. Also on the console there appeared four
lights, one each to indicate coincidence for the 
individual dimensions and one to indicate the 
achievement of lock-on. Coincidence was defined 
as that period of time during which the target and 
the hand control had equivalent positions for that 
particular parameter. Lock-on was defined as that 
period of time during which there was coinci-
dence in azimuth, range, and elevation simultane-
ously. When lock-on was achieved, all four lights
were lit. 

The monitor's console, a CRT, was mounted
with three clocks, which provided a measure of the
time during which a subject achieved coincidence
in the parameters of azimuth, range, and elevation.
From another clock mounted next to the target
position selection panel, data could be obtained
concerning the time necessary to lock-on
or the duration of lock-on to the target. During
the target acquisition phase of the experiment,
Task I, the experimenter inserted the desired target
parameters by means of the push-button target
position selection panel, there being eight possible
selections for each of the three parameters:
azimuth, range, and elevation. The subjects’
performance in azimuth and range was observed
by means of the monitor’s scope. A scope
provided the experimenter with information
concerning the subjects’ performance in the elevation
dimension. In addition, during the target acquisition
phase initial reversals, defined as initial movements
in range and elevation which were in a
direction opposite to where the target appeared
on the scope, were recorded by a Sanborn
recorder.
Fig. 2. Hand control.


The hand control used by the subjects in this 
study, as shown in Figure 2, was a handgrip with 
a serrated wheel mounted at the top. Pivotal rota- 
tion of the handgrip to the left and right or 
forward and backward produced movement of the 
azimuth-range symbol (a small circle), respectively, 
in azimuth and range. Forward and backward
rotation of the thumbwheel produced movement of
the elevation symbol.

Pressure on the trigger of the hand control activated
the display system. The ratios of movement
of the hand control and thumbwheel to movement
of the azimuth-range and elevation symbols were
as follows: (a) ±1° movement of the thumbwheel
produced ±1 millimeter displacement of the elevation
symbol; (b) ±1° movement of the hand control
in the azimuthal and range directions produced,
respectively, ±1.33 millimeter displacement of the
azimuth-range symbol in azimuth and range.
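
As a rough illustration of these ratios, the sketch below reads them as 1 millimeter and 1.33 millimeters of symbol displacement per degree of control rotation. It assumes the gain is linear over the whole range of movement, which the text does not state; the constant and function names are hypothetical.

```python
THUMBWHEEL_MM_PER_DEG = 1.0   # elevation symbol
HANDGRIP_MM_PER_DEG = 1.33    # azimuth-range symbol, in azimuth and in range


def symbol_displacement_mm(rotation_deg, mm_per_deg):
    """Assumed linear gain: signed degrees in, signed millimeters out."""
    return rotation_deg * mm_per_deg


# A 3-degree backward (negative) rotation of the handgrip.
print(symbol_displacement_mm(-3, HANDGRIP_MM_PER_DEG))
```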

The dynamic target was produced by using two
low-frequency function generators. The azimuth
signal was a triangular wave generated at a
frequency of .02 cycles per second; the range and
elevation signals were triangular waves generated
at a frequency of .01 cycles per second.
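
A minimal sketch of such a target signal follows. Only the two frequencies come from the text; the amplitude (normalized here to the range -1 to +1) and the starting phase are assumptions, and `triangle_wave` is a hypothetical helper.

```python
AZIMUTH_HZ = 0.02          # one cycle every 50 seconds
RANGE_ELEVATION_HZ = 0.01  # one cycle every 100 seconds
TRIAL_SECONDS = 5 * 60     # each tracking trial lasted 5 minutes


def triangle_wave(t, freq_hz):
    """Triangular wave in [-1, 1] that starts at -1 when t = 0 (assumed phase)."""
    phase = (t * freq_hz) % 1.0
    if phase < 0.5:
        return 4.0 * phase - 1.0   # rising half of the cycle
    return 3.0 - 4.0 * phase       # falling half of the cycle


# Target position sampled once per second over one trial.
azimuth_track = [triangle_wave(t, AZIMUTH_HZ) for t in range(TRIAL_SECONDS)]
range_track = [triangle_wave(t, RANGE_ELEVATION_HZ) for t in range(TRIAL_SECONDS)]
```

At these frequencies a 5-minute trial contains about six azimuth cycles and three range or elevation cycles, so the target reverses direction only every 25 or 50 seconds.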


Target Selection 


As noted above, 64 trials were given to each
subject during Task I. The equipment was constructed
with push-button target position selectors,
eight each for azimuth, range, and elevation, so
that 512 different target positions were possible.
The objective of this experiment was to investigate
display-control compatibility in range and elevation
only; thus, 64 trials were administered to each
subject in order to include all of the 64
range-elevation combinations.

Range. Range was measured from the horizontal
diameter of the scope with four range positions
above this diameter and four range positions below.
Absolute deviation was measured as a perpendicular
drop to the diameter. The two range positions
which appeared nearest the diameter, one above
and one below, were assigned an absolute-deviation
level of I. Successive range positions above
and below the diameter were assigned
absolute-deviation levels of II, III, and IV, respectively.

Elevation. An initial point in elevation was de- 
fined as a center point with four elevation positions 
above and four below this origin. Absolute devia- 
tion from the elevation origin was assigned in 
exactly the same manner as described above for 
range. 

Azimuth. Four absolute-deviation levels were as- 
signed to azimuth in the same manner as outlined 
for range. Absolute deviation in this case, however, 
was measured as the perpendicular distance to the
azimuthal diameter. 

Sixteen range-elevation deviation combinations 
resulted from considering all possible pairings of 
the range and elevation absolute-deviation levels. 
Four azimuthal positions, two from the left and 
two from the right, were selected for each of these 
16 combinations. Each of the four absolute-devia- 
tion levels in azimuth was represented once for 
each of the 16 range-elevation pairings. Also, for 
each range-elevation pairing, the composite devia- 
tion level from the azimuthal diameter of the two 
left azimuthal positions was equal to the com- 
posite deviation level of the two right azimuthal 
positions. A total of 64 trials resulted, with the 
azimuth parameter equally distributed among the 
range-elevation deviation combinations. 
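
The counts in this design can be checked with a short enumeration. The sketch codes the four absolute-deviation levels as 1 to 4 with a sign for the side of the reference line, and it assumes that the "composite deviation level" of two positions is the sum of their levels; both choices are illustrative rather than taken from the original apparatus.

```python
from itertools import combinations, product

LEVELS = (1, 2, 3, 4)   # absolute-deviation levels I to IV
SIDES = (-1, +1)        # below/above (or left/right of) the reference line

# Eight signed positions per dimension.
positions = [side * level for side in SIDES for level in LEVELS]

all_targets = list(product(positions, repeat=3))   # azimuth x range x elevation
range_elevation = list(product(positions, repeat=2))
level_pairings = list(product(LEVELS, repeat=2))   # range level x elevation level


def balanced_azimuth_splits():
    """Ways to put two levels on the left and two on the right with equal sums."""
    splits = []
    for left in combinations(LEVELS, 2):
        right = tuple(level for level in LEVELS if level not in left)
        if sum(left) == sum(right):
            splits.append((left, right))
    return splits


print(len(all_targets), len(range_elevation), len(level_pairings))
print(balanced_azimuth_splits())
```

Under the summation assumption, the only balanced split pairs levels I and IV on one side with levels II and III on the other, so each of the 16 pairings receives each azimuth level exactly once.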


Scoring 


During Task I, two measures were recorded:
time to acquire lock-on and composite initial
reversals, defined as the sum of initial reversals in
range and elevation. An initial reversal made in
order to achieve coincidence in range was considered
an error. Likewise, an error was recorded if
an initial reversal was made to achieve coincidence
in elevation. During Task II, lock-on time and
the length of time of coincidence in each of the
dimensions were recorded.


RESULTS AND DISCUSSION
Task I 


Four groups of subjects were asked to acquire
targets in azimuth, range, and elevation
as quickly and accurately as possible.
Each group used a different display-control
system.

An analysis of variance was carried out to 
investigate differences among the four dis- 
play-control systems in terms of composite 
initial reversals in range and elevation. A 
statistically significant difference at the .005 
confidence level was found among the four 
groups. Bartlett’s test of homogeneity of 
variance was carried out among the four 
groups. No statistically significant difference 
was found. 
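
For readers unfamiliar with the procedure, the following sketch computes the F ratio of a one-way analysis of variance in plain Python. It illustrates the general technique only; the scores are invented for the example and are not the study's data, and `one_way_anova_f` is a hypothetical helper.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means about the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the individual scores about their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Invented scores for three small groups, purely for illustration.
example = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(one_way_anova_f(example))
```

With four groups of 10 subjects, as in this experiment, the same function would return 3 and 36 degrees of freedom for the between-groups and within-groups terms.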

In Table 2 are shown the results of 
Tukey’s test (Bowker & Lieberman, 1959) 
to determine for which pairs among the four 
groups there was a statistically significant dif- 
ference between composite initial reversals. 
Only Groups B and D differed significantly. 
The performance of Group B, whose subjects 
used the display-control relationship which 
required pulling back on the hand control 
and thumbwheel to produce a downward mo- 
tion on the scope, was superior to the per- 
formance of Group D, whose members were 
required to pull back on the hand control 
in order to produce upward movement and
to pull back on the thumbwheel to produce
downward movement on the scope.

Further analyses of variance were carried
out to determine if for this sample the
initial reversals in range and elevation con- 
tributed equally to the difference among the 
groups. The results indicated that there was 
a statistically significant difference for initial 
range reversals at the .005 level of confi- 
dence, but that there was not a significant 
difference among the groups in terms of 
initial elevation reversals. These data support
the notion that at least for this sample
the differences observed among groups were
attributable to initial reversals with the range 
control (forward and backward stick move- 
ment) rather than to initial reversals with 
the elevation control (thumbwheel). One ex- 
planation of the significance of initial re- 
versals in range is found in the method which 
the subjects employed to acquire the target. 
In most cases, subjects moved the azimuth- 
range control before they moved the eleva- 
tion wheel. If the subject had learned a 
particular display-control system and yet
made an initial reversal in range, the in- 
formation gained from the reversal action 
would aid him in achieving the correct re- 
sponse in elevation. 

The previously mentioned survey by Mor- 
rill and Sprague (1960) pointed out that 
there existed a preference for a compatible 
display-control system, i.e., for a forward 
or backward motion of both the hand con- 
trol and the thumbwheel to produce the 
same directional movements on the scope. 
Task I of the present study, which required
the subjects to carry out a target
acquisition task, seemingly does not support
these previous findings. Both Group A and
Group B used internally compatible
display-control systems for range and elevation, while
Group C and Group D used internally
incompatible systems for range and elevation;
yet, the only statistically significant difference
in terms of initial reversals was between
one compatible and one incompatible
system. There were no statistically significant
differences between any of the other
pairs. Further examination of the data provided
an explanation for the unique difference
between Group B and Group D. The
two important factors in this explanation
are: the compatibility of range and elevation
movements within each display-control sys- 
tem and the specific display-control relation- 
ships for range. Groups A and B used dis- 
play-control systems that were internally 
compatible in terms of movements associated 
with both range and elevation; no difference, 
then, would be expected between these two 
groups. Groups C and D both used systems 
that were incompatible within themselves; 
thus again no difference would be expected. 
Although Group A used an internally com- 
patible system and Group D did not, both 
groups used the same configuration to con- 
trol range. Since the effect of initial range 
reversals proved to be significant whereas 
the effect of initial elevation reversals did 
not, the lack of difference in composite initial 
reversals between Groups A and D may 
well be attributed to the similarity of their 
range configurations. This explanation is also 
appropriate for the lack of difference between 
Groups B and C. 

The performance of Group B was superior 
to that of Group D. Group B used a display- 
control system which was internally com- 
patible: that is, a backward motion of both 
the hand control and the thumbwheel produced
the same directional movements of the
azimuth-range and elevation symbols on the
scope. Furthermore, for Group B, a backward
movement of the hand control produced
a downward movement of the azimuth-
range symbol. Since the effects of initial 
range reversals proved to be significant, 
whereas the effect of initial elevation rever- 
sals did not, it appears that the configuration 
used by Group B to control range was pre- 
ferred to the one used by Group D. Perhaps 
stimulus-response compatibility for range and 
elevation within a display-control system 
and, in addition, a second-order interaction 
of a specific stimulus-response configuration 
for range, namely, a backward movement of 
the hand control to produce a downward 
movement of the azimuth-range symbol, 
must be operative for one group and neither 
of these conditions operative for another 
group in order to produce a statistically sig- 
nificant difference between the groups. Group 
B had both conditions operating, internal
compatibility and the preferred range configuration,
whereas Group D had neither.
Thus a difference was demonstrated. Con- 
cerning Groups A and C, although Group A 
employed an internally compatible system 
and Group C did not, Group C, but not 
Group A, utilized the preferred range con- 
figuration. In this case, therefore, no differ- 
ence was found. 
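
The two-condition explanation can be written out as a small check. Because Table 1 is not reproduced here, the group configurations below are inferred from the descriptions in the text (direction of symbol movement when the control is pulled back), and the rule "a pair differs only when one group meets both conditions and the other meets neither" is a paraphrase of the argument, not a statistical model. All names are hypothetical.

```python
from itertools import combinations

# Symbol movement produced by pulling back on each control (inferred from the text).
GROUPS = {
    "A": {"hand": "up", "thumbwheel": "up"},
    "B": {"hand": "down", "thumbwheel": "down"},
    "C": {"hand": "down", "thumbwheel": "up"},
    "D": {"hand": "up", "thumbwheel": "down"},
}


def conditions_met(group, preferred_range):
    """Count how many of the two conditions hold for a group (0, 1, or 2)."""
    config = GROUPS[group]
    internally_compatible = config["hand"] == config["thumbwheel"]
    preferred_configuration = config["hand"] == preferred_range
    return int(internally_compatible) + int(preferred_configuration)


def predicted_differences(preferred_range):
    """Pairs in which one group meets both conditions and the other meets neither."""
    pairs = []
    for g1, g2 in combinations(sorted(GROUPS), 2):
        scores = {conditions_met(g1, preferred_range), conditions_met(g2, preferred_range)}
        if scores == {0, 2}:
            pairs.append((g1, g2))
    return pairs


# Target acquisition: preferred range configuration is backward -> downward.
print(predicted_differences("down"))
```

Taking the preferred range configuration as a parameter makes it easy to ask what the same rule would predict if the preferred direction were reversed.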

An analysis of variance was carried out 
to determine if there was a significant difference
among the four groups in terms of
time to acquire lock-on during Task I. No 
statistically significant difference was found. 


Task II


In Table 3 are shown the results of an 
analysis of variance in terms of time in sec- 
onds that lock-on to the target was achieved 
during a total of four 5-minute trials. There 
was a statistically significant difference
among the groups at the .05 confidence level 
and between trials at the .001 confidence 
level. Bartlett’s test of homogeneity of vari- 
ance was carried out among the four groups. 
No statistically significant difference was 
found. 

In Table 4 are shown the results of Tukey’s
test to determine for which pairs among
the four groups there was a statistically significant
difference between total mean times
in seconds of lock-on to the target. Groups A
and C were significantly different from each
other at the .05 confidence level.


TABLE 3
ANALYSIS OF VARIANCE FOR TIME OF
LOCK-ON TO TARGET, TASK II


TABLE 4
MEAN TIMES

Groups    Mean Time of Lock-on

With the exception of the pairings of
Groups B and D and Groups A and C, the
reasons for the absence of statistically
significant differences between pairings of
groups are the same as for target acquisition.
Concerning Groups A and C and Groups
B and D, the results support the explanation
that during the tracking task, as during the
acquisition task, the stimulus-response
compatibility for range and elevation within a
display-control system and, in addition, a
second-order interaction of a specific
stimulus-response configuration for range must
be operative for one group and neither of
these conditions operative for another group
in order to produce a statistically significant
difference between the groups. However, in
contrast to the target acquisition task, the
preferred range configuration for tracking
was a backward movement of the hand control
to produce an upward movement of the
azimuth-range symbol. During the tracking
task, Group A had both conditions operating,
internal compatibility and the preferred
range configuration, whereas Group C had
neither. Furthermore, while Group B em- 
ployed an internally compatible system and 
Group D did not, Group D, but not Group 
B, utilized the preferred range configuration. 
Thus, a difference was found between
Groups A and C and not between Groups
B and D. The reason why subjects preferred
one display-control configuration for range
during target acquisition and a different
display-control configuration for range during
target tracking remains an open question.

Table 3 showed a statistically significant
difference between trials. For all groups
combined in Task II, trials were compared
to see if practice improved performance during
the four trial sessions, with improvement
measured by mean time in seconds of lock-on
to the target. By using Tukey’s method
to determine which pairs among the four
trials differed, it was found that there was
a statistically significant difference at the
.01 level between all trial means except
between Trials II and III, where the difference
was significant at the .05 level. These
data and standard deviations associated with
trial means are presented in Table 5.
Examination of these data shows that the
subjects in all groups were in fact able to
improve performance as a function of training.
The duration and number of the trials, 5
minutes for each of four trials, and the
intensity of the task might have had an
adverse effect upon performance. The positive
effect of practice, however, appears to be
more dominant than performance degradation
by fatigue. A pictorial representation,
by groups and trials, of mean time of lock-on
to the target is given in Figure 3.

Fig. 3. Mean times of lock-on to target by
groups and trials, Task II.


SUMMARY 


Analyses were carried out to determine the 
effects of two internally compatible and two 
internally incompatible display-control sys- 
tems upon operator performance during tar- 
get acquisition and target tracking tasks 
using three dimensions of information pre- 
sented on a two-dimensional display surface. 
The results of this study suggest the following
conclusions:


1. During target acquisition, the display- 
control relationship which required that the 
subjects pull back on the hand control and 
thumbwheel to produce a downward motion 
on the scope was superior to the control 
which required that subjects pull back on 
the hand control to produce an upward 
movement and pull back on the thumbwheel 
to produce a downward movement on the 
scope. The difference in composite initial 
reversals which was found was attributable 
to stimulus-response compatibility of the 
range and elevation controls within a display- 
control system and to a second-order inter- 
action attributable to a specific display- 
control relationship for range, namely, a
backward movement of the hand control to
produce a downward movement of the
azimuth-range symbol.


2. During target tracking, the display-con- 
trol relationship which required that the sub- 
jects pull back on the hand control and 
thumbwheel to produce an upward motion
on the scope was superior to the control 
which required that the subjects pull back 
on the hand control to produce a downward 
movement and pull back on the thumbwheel 
to produce an upward movement on the 
scope. The difference in time of lock-on 
which was found was attributable to the 
stimulus-response compatibility of the range 
and elevation controls within a display-con- 
trol system and to a second-order interaction 
attributable to a specific display-control re- 
lationship for range, namely, a backward 
movement of the hand control to produce 
an upward movement of the azimuth-range 
symbol.


CONSUMER VERSUS MANAGEMENT REACTION
IN NEW PACKAGE DEVELOPMENT


MILTON L. BLUM AND VALENTINE APPEL


Marketing, Merchandising and Research, Incorporated 


The last decade has been one of radical 
change in packaging as it has in many other 
areas of commercial life. A real part of this 
packaging revolution has been the contribu- 
tion which consumer research has made. In 
fact, the introduction of a new package 
without the benefit of consumer research 
evaluation is becoming the exception rather
than the rule. The study to be reported 
points up the importance of consumer re- 
search in such package development pro- 
grams. 

The writers’ firm was engaged to conduct 
a preliminary packaging study for one of 
its clients. The client’s objective was to de- 
velop a package for a new product line. The 
product was intended for use by men but to 
be bought by women as a gift. 

The purpose of the study was to screen, 
from a group of 18 design renderings sub- 
mitted, the designs which showed the most 
promise, and to indicate possible areas of 
design modification which might further im- 
prove the acceptance of the more promising 
of the design concepts. The principal inten- 
tion was ultimately to evaluate the more 
promising designs further based upon
three-dimensional prototypes and larger samples of
respondents. 

Earlier research had detailed certain 
specifications which the ideal package should 
meet. Among these was the decision that the 
package should appear both as masculine and 
relatively expensive. Moreover, women should 
prefer it as a gift for their husbands, and men 
should prefer to receive it as a gift for them- 
selves. 


The study was unusual in that not only
consumers were interviewed. The client’s
management, and also the design firm which
created the packages, agreed to evaluate the
designs from what they considered to be the
female consumer’s point of view. There was,
therefore, the opportunity to compare the
judgments of designers, management, and
consumers.


METHOD


The study employed four independent groups of
raters: female consumers (N = 80), male
consumers (N = 39), advertising and marketing
executives of the client company (N = 8), and the
industrial designers who created the packages
(N = 7). Each of the members of these groups
individually rated a total of 18 different package
design renderings using Stephenson’s (1953) Q sort
technique. The 18 designs were rated in terms of
the extent to which each design was perceived as:
masculine or feminine, expensive or inexpensive,
and appropriate or inappropriate as a male gift.

The Q sort was performed by asking the
respondent to arrange the renderings into seven
scaled categories, each category being assigned a
score ranging from one to seven. For each
respondent, this resulted in a forced frequency
distribution of scores for the 18 designs. This frequency
distribution was perfectly symmetrical, approached
normality in shape, and had a modal rating of
four which was assigned to six of the 18 designs.
The forced frequency distribution and the scores
assigned to each category were as follows:


Frequency   1  2  3  6  3  2  1

Score       1  2  3  4  5  6  7
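
The forced distribution can be expressed as a simple validity check on one respondent's sort. The sketch assumes the frequencies 1, 2, 3, 6, 3, 2, 1 for scores one through seven, which is a distribution consistent with the description given (symmetrical, 18 designs, six designs at the modal score of four); the function name is hypothetical.

```python
from collections import Counter

# Assumed forced distribution: score -> number of designs allowed at that score.
FORCED_DISTRIBUTION = {1: 1, 2: 2, 3: 3, 4: 6, 5: 3, 6: 2, 7: 1}


def is_valid_q_sort(scores):
    """True when a respondent's 18 scores match the forced distribution exactly."""
    return dict(Counter(scores)) == FORCED_DISTRIBUTION


# One sort that satisfies the forced distribution (the order of designs is arbitrary).
example_sort = [score for score, count in FORCED_DISTRIBUTION.items() for _ in range(count)]

print(len(example_sort), is_valid_q_sort(example_sort))
print(sum(example_sort) / len(example_sort))   # mean score under the forced distribution
```

Because every respondent uses the same distribution, each respondent's ratings have the same mean and spread, so differences between designs reflect ordering alone.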


The advertising and marketing executives of the
client company, and the members of the design
firm Q sorted the same 18 designs only on the
basis of the extent to which they believed that
women would be willing to give each of the packages
to their husbands as a gift. This made for a
total of eight variables to be analyzed: three each
for the male and female consumers, and one each
for management and the designers. Because of the
amount of time involved in rating the designs for
each variable, it was not considered desirable to
request the management and design groups to rate
the designs on more than one variable. The
ostensible purpose of asking management and the
designers to complete the ratings was primarily
as a device to explain to them the method
employed.


RESULTS 


Each design was assigned an overall rating
for each variable which was the mean
score for the group evaluating the designs,
and the mean scores were converted into 
ranks for each variable. To measure the ex- 
tent of agreement and disagreement among 
the four groups of raters, the Spearman 
rank-difference correlations among the eight 
variables were calculated. 
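
A minimal sketch of the rank-difference computation follows. It converts mean scores to ranks (tied values share their mean rank) and applies the classical formula rho = 1 - 6*sum(d^2) / (n(n^2 - 1)), which is exact only when there are no ties. The input values are invented for illustration and are not the study's ratings.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values receive their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Rank-difference correlation between two equally long lists of scores."""
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


# Invented mean ratings of four designs by two groups of raters.
group_1 = [4.1, 3.2, 5.0, 2.7]
group_2 = [2.9, 3.8, 2.1, 4.6]
print(spearman_rho(group_1, group_2))
```

In the study itself n would be 18, the number of designs, for every pair of variables.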

The correlations for the gift ratings between
the men and women and between
management and the designers were as fol- 
lows: .58 between the men and women, and 
.55 between management and the designers. 
From this it can be seen that there was fair 
agreement between management and the de- 
signers as to which packages they believed 
women would be more likely to prefer as 
gifts for their husbands. There was also 
fair agreement between the men and women 
as to which designs they would like to give 
and receive. Both the male and female con- 
sumers, however, were in substantial dis- 
agreement with the other two groups on this 
point. The correlations between the consumers
and the management and designer
groups were as follows: -.48 between the
designers and the women, -.14 between the
designers and the men, -.21 between management
and the women, and -.42 between
management and the men. The reasons underlying
this disagreement can be understood in
terms of the matrix of intercorrelations among
all eight variables as shown in Table 1.
Examination of this correlation matrix reveals
two clearly defined clusters. The first
cluster is composed of the gift ratings of
management and of the designers, and of 
the masculinity ratings of the male and fe- 
male consumers. The second cluster is com- 
posed of the gift and the expensiveness rat- 
ings of the consumers. The two clusters cor- 
relate negatively with each other. The one 
exception is the low positive correlation 
(.23) between the gift and masculinity rat- 
ings of the male consumers. 

The reason for this disagreement, between 
the consumers on the one hand and man- 
agement and the designers on the other, 
stems from the fact that these two groups 
were apparently using conflicting criteria in 
evaluating the designs. Management and the 
designers were evaluating the designs in 
terms of what the consumer perceived to be 
masculinity. Those designs which were perceived
as being more masculine tended to
be the same ones which the designer and
management groups thought the consumers
would prefer. The ratings of the consumers,
on the other hand, tended to vary as a function
of what they considered to be the
expensive appearance of the design.

TABLE 1
RANK DIFFERENCE INTERCORRELATIONS
AMONG THE EIGHT VARIABLES
(N = 18 designs)

Variable
1. Masculinity-men
2. Masculinity-women
3. Gift-designers
4. Gift-management
5. Gift-men
6. Gift-women
7. Expensiveness-men
8. Expensiveness-women


In this particular case expensiveness and
masculinity appear to be relatively incompatible
criteria, the correlation between them
being -.73 among the women, and -.47
among the men. Since the two groups of
raters tended to use one of these criteria to
the relative exclusion of the other, the gift
ratings of the consumers tended to correlate
negatively with the ratings of the client’s
management and of the designers who created
the designs. This is not to say that masculinity
was completely unimportant among
the consumer samples. It is to say that of
the two variables, masculinity and expensiveness,
expensiveness was the more important.
Actually, among the sample of males, masculinity
assumes considerable importance when
the effects of perceived expensiveness are
partialed out or eliminated. The partial
correlations between the gift ratings and the
masculinity ratings, when expensiveness is
partialed out, are: .58 for the men, and -.13
for the women. The inference to be drawn
here is that masculinity does contribute to
preference on the part of the men when the
effects of perceived expensiveness are eliminated.
In the case of the women, masculinity
appears to play no role at all in contributing
to preference.
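
The partialing-out step can be illustrated with the standard first-order partial correlation formula. In the sketch, the gift-masculinity correlation (.23) and the masculinity-expensiveness correlation (-.47) for the men are the values reported in the text; the gift-expensiveness correlation is not reported there, so the value used below is a hypothetical placeholder, and the printed result should not be read as a reproduction of the reported .58.

```python
from math import sqrt


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with z held constant (first-order partial)."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


r_gift_masc = 0.23    # reported for the male consumers
r_masc_exp = -0.47    # reported for the male consumers
r_gift_exp = 0.60     # HYPOTHETICAL placeholder, not a value from the study

print(partial_correlation(r_gift_masc, r_gift_exp, r_masc_exp))
```

The formula shows why a small zero-order correlation can grow once expensiveness is held constant: when gift ratings rise with expensiveness while masculinity falls with it, the product term is negative and is subtracted from the zero-order value, which raises the numerator.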


DISCUSSION 


The marketing implications of these find- 
ings are clear. Had the packaging decision 
been made on the basis of the recommendation
of the design firm and on the pooled
judgment of the client’s marketing manage- 
ment, the net effect would have been to 
select designs which would have had the 
least appeal so far as the consumers sam- 
pled were concerned. 

The result of the research was to outline 
specifications for the design group which 
would enable them to modify certain of the 
designs in ways which would cause them 
to be perceived by the consumer as both
masculine and expensive.

These findings point up the contribution 
which consumer research can make to the 
company involved in new packaging plans. 
Without the kind of information which con- 
sumer research can provide, management de- 
cisions concerning new package development 
remain much more of a gamble than most 
manufacturers can afford.


Studies of automated teaching or programed
instruction (PI) in schools, colleges, and the 
Armed Forces (Lumsdaine & Glaser, 1960) 
have shown that this technique has consider- 
able promise in terms of reducing training 
time and teaching more effectively. At the 
time of writing this article, no comparable 
studies had been reported on the use of PI 
with industrial employees. Because of the im- 
plications of PI for industrial training pro- 
grams, a research project was undertaken to 
evaluate its feasibility and effectiveness in an 
industrial training situation by means of ex- 
periments at technical employee training cen- 
ters of a large company. This article will de- 
scribe the first experiment completed under 
this project, which compared the learning 
achievement of employee classes taught by 
PI in the form of programed textbooks with 
that of classes taught by conventional class- 
room instruction. The reactions of the ex- 
perimental classes to the use of PI were also 
obtained. 


PROCEDURE


In March 1960, a team composed of a train- 
ing center instructor and a psychologist was 
formed to prepare programed textbooks for 
the introductory section of a 16-week course 
on the IBM 7070 Data Processing System 
given to computer service men at a company 
training center. 

By September, five programed textbooks 
containing 719 frames were completed. These 
frames covered the first 15 hours of conventional
classroom presentation. This amount of
class time would be equivalent to 5 weeks of
a 3-hour college course. The topics covered
were the names and functions of units of the 
7070, bit coding, data flow, types of com- 
puter words, and the program step. To test 
the effectiveness of PI in teaching this type 
of material, the following experiment was 
designed: 


Two classes (n = 42) which reported to the training
center during September 1960, were designated
the control classes. They were taught the introductory
material of the course by two different instructors
using the conventional classroom method
(lecture-discussion). This instruction covered a period
of four mornings and totaled 15 hours, 3 hours on
the first morning and 4 hours on each of the remaining
three mornings. The afternoons of each day were
spent on another phase of 7070 training. On the fifth
morning, these classes were administered a comprehensive
2-hour achievement test consisting of 88 completion
and multiple-choice items. This test was prepared
by the program writing team with the cooperation
of several training center instructors. A
new test was necessary because no satisfactory
objective test of sufficient length was available for the
part of the course taught by PI.

Six classes (n = ...) made up the experimental
group. Two of these classes reported for training
each month from October through December 1960.
They were instructed solely by means of programed
textbooks, which were substituted for the lectures
and discussions of the introductory part of the course.

The classroom time allotted for programed texts
was reduced to 11 hours spread over a 3-day period,
with 3 hours on the first day and 4 hours on each
of the last 2 days. This reduction in classroom
presentation time was based on fairly conservative
estimates of the time needed for the trainees to
complete the programed texts. The trainees were also
permitted to take the programed texts home with
them for evening study.
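
The size of the planned reduction follows directly from the two schedules. The short calculation below uses only figures given in the text (the daily hours for each method and the 719 frames); the variable names are illustrative, and the frames-per-hour figure is simply an average over allotted class time, not a measured working rate.

```python
CONTROL_HOURS = [3, 4, 4, 4]     # four mornings of lecture-discussion
EXPERIMENTAL_HOURS = [3, 4, 4]   # three days of programed texts
TOTAL_FRAMES = 719               # frames in the five programed textbooks


def percent_reduction(before, after):
    """Reduction from `before` to `after`, as a percentage of `before`."""
    return 100 * (before - after) / before


control_total = sum(CONTROL_HOURS)
experimental_total = sum(EXPERIMENTAL_HOURS)
saving = percent_reduction(control_total, experimental_total)
frames_per_allotted_hour = TOTAL_FRAMES / experimental_total

print(control_total, experimental_total, round(saving, 1))
print(round(frames_per_allotted_hour, 1))
```

Because trainees could also study the texts at home in the evening, the classroom figures give only part of the total time spent on the material.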

The class instructors were directed to act as if the
programed textbooks were part of the regular classroom
procedure in order to minimize any possible
Hawthorne effect. It was never mentioned to the
students that they were participating in an experiment.
The instructors confined their role to stating
at the beginning of the first class period that this
section of the course would be taught by five
self-explanatory programed textbooks. They then passed
out the first programed text. The third and fifth
texts were passed out at the beginning of the second
and third days of the experiment, respectively. The
second and fourth texts were given to the trainees
during the first and second classroom periods,
respectively, after they had finished the texts passed
out at the beginning of the period.

The reason for deliberately pacing the completion
of the five programed texts over the 3-day period in
this manner was to assure better administrative
control. This experimental design, however, prevented
the faster students from finishing all of the texts
before the third day, and did not permit the direct
measurement of the full saving in presentation time
possible under PI.

After passing out the texts at the beginning of 
each class period, the instructors retired to the back 
of the classroom and confined their activities to re- 
cording the number of frames that each trainee com- 
pleted in class. They were also instructed to answer 
as briefly as possible the questions asked by trainees. 
A record was kept of all questions asked. 

The experimental classes also took the same com- 
prehensive achievement test on the day following the 
completion of their instruction. In addition, they 
anonymously completed a Student Questionnaire ask- 
ing them to evaluate PI. The questionnaire consisted 
of five items with five-point descriptive scales meas- 
uring the effectiveness, difficulty, and acceptability of 
PI, and three open-ended questions asking for any 
general comments and any aspects of PI particularly 
liked or disliked. 

The control and experimental groups were run con- 
secutively rather than concurrently in order to reduce 
any contamination of results. Since members of both 
classes starting each month at the training center 
might come from the same company field office and 
might also room together, it was decided to eliminate
the possibility that study materials would be
exchanged by control and experimental trainees during
evening study periods.

To avoid interference with the administration of 
the company training center, no attempt was made 
to assign trainees to class by random procedures. In- 
stead, men were assigned to classes as they were re- 
ported available for training by their office managers 
in the field. In planning the experiment, it was 
anticipated that analysis of covariance procedures 
would make it possible to control for background
variables that differed between the control and
experimental groups and were correlated with achievement
test scores.

In order to test the comparability of the control 
and experimental groups on various background data, 
such as age, educational level, total months of ex- 
perience, and previous computer experience, data were 
collected by means of an Education and Experience 
Questionnaire. It should be noted that these groups 
generally consisted of well-selected, highly motivated 
men who had originally been carefully screened for 
employment and who had satisfactory work records 
with the company. A company-developed test of reasoning
ability, the Programer Aptitude Test (PAT),
was also administered. The significance of differences 
on these variables for the control and experimental 
groups and correlations of these variables with 
achievement test scores were calculated. 


RESULTS 


The subject matter covered in these experi- 
ments took 15 hours of classroom time to pre- 
sent by the conventional lecture-discussion 
method. The same information was covered 
in 11 hours by programed textbooks, a saving
of 4 hours or 27% in classroom presentation
time.

In response to an item on the Student Questionnaire,
60% of the experimental class reported
that PI required less home study than
the conventional classroom method. Twenty-four
percent reported spending the same
amount of home study under both methods,
and 16% stated that PI required more home
study. These results indicated that the total 
reduction in study time achieved by the use 
of programed texts was actually more for most 
of the students than indicated in this experi- 
ment, which measured only the reduction in 
classroom presentation time. 

It should also be remembered that the 
amount of classroom presentation time for the 
experimental group was arbitrarily fixed to 
effect a conservative savings in classroom 
time. Because the programed textbooks were 
taken out of class by the trainees, records of 
the actual time needed for completing the five 
texts could not be maintained. From the in- 
structors’ records of the number of frames 
that each trainee completed in class, however,
it was possible to derive some estimates of 
individual differences in the time required to 
complete the program. A mean completion 
time per frame was calculated for each trainee. 
On the basis of these figures, the mean com- 
pletion time per frame for the entire group 
was calculated to be 49 seconds and the stand- 
ard deviation, 9 seconds. For the total 719- 
frame program, it was therefore estimated 
that the mean completion time was 9.8 hours 
and the standard deviation, 1.8 hours. Indi- 
vidual differences in estimated completion 
time ranged from 7.2 to 15.3 hours. Thus, 
the mean completion time was 1.2 hours less 
than the 11 classroom hours allotted for PI 
in this experiment, and there were large in- 
dividual differences in completion times. This 
finding suggested that even greater savings in 
instruction time would be possible for most 
trainees if they used instruction on an indi- 
vidual basis. Because of the experimental de- 
sign used, these savings could not be directly 
measured in the present experiment. 
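
The time estimates in the preceding paragraph can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are hypothetical, and it assumes, as the text appears to, that a trainee's total completion time is simply the per-frame time multiplied by the 719 frames, so that the standard deviation scales by the same factor.

```python
# Figures taken from the text; the variable names are illustrative.
FRAMES = 719                 # frames in the full program
MEAN_SEC_PER_FRAME = 49      # group mean completion time per frame
SD_SEC_PER_FRAME = 9         # standard deviation of per-frame time
ALLOTTED_HOURS = 11          # classroom hours given to the PI classes
LECTURE_HOURS = 15           # classroom hours under lecture-discussion


def to_hours(seconds_per_frame, frames=FRAMES):
    """Scale a per-frame time in seconds to hours for the whole program."""
    return seconds_per_frame * frames / 3600


mean_hours = to_hours(MEAN_SEC_PER_FRAME)   # about 9.8 hours
sd_hours = to_hours(SD_SEC_PER_FRAME)       # about 1.8 hours
slack_hours = ALLOTTED_HOURS - mean_hours   # about 1.2 hours unused
classroom_saving = (LECTURE_HOURS - ALLOTTED_HOURS) / LECTURE_HOURS

print(f"mean {mean_hours:.1f} h, SD {sd_hours:.1f} h, "
      f"slack {slack_hours:.1f} h, saving {classroom_saving:.0%}")
```

Under these assumptions the figures agree with the reported 9.8 hours, 1.8 hours, 1.2 hours, and 27%.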

A comparison of the aptitude test scores 
and background variables for the control and 
experimental groups and their correlations 
with the achievement test scores are given in
Table 1. Of all the background variables, only 
the PAT scores showed a significant difference 
between the two groups and had a significant 
relationship with achievement test scores. The 
hypothesis that both groups were drawn from 
the same population on reasoning ability was 
rejected at the .05 level. The hypothesis of no 
relationship between reasoning ability and 
achievement test scores was rejected at the 
.05 and .01 levels for the control and experimental
groups, respectively. Analysis of covariance
was used to test the significance of
differences in residual achievement test scores 
after eliminating the effect of PAT scores on 
achievement. 

Table 2 shows the results obtained from the 
analysis of covariance (Walker & Lev, 1953). 
The null hypothesis of no difference between 
the control and experimental group regression 
slopes was accepted (F = .107). The null hy- 
pothesis of no differences in residual achieve- 
ment test scores between experimental and 
control groups was rejected by F test at the 
.01 level of confidence (F = 50.39). Thus, the
obtained differences in achievement test scores 
could not be wholly attributed to differences 
in aptitude test scores (PAT). 
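
As a rough check on the two F ratios just cited, the sketch below recomputes them from the rounded sums of squares and degrees of freedom printed in Table 2. The function name is hypothetical. Because the table entries are rounded to whole numbers, the recomputed ratios only approximate the published values of 50.39 and .107.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """Ratio of two mean squares, each a sum of squares over its df."""
    return (ss_effect / df_effect) / (ss_error / df_error)


# Difference between adjusted group means, tested against the
# residual around the common within-groups slope (Table 2).
f_adjusted_means = f_ratio(1411, 1, 3055, 109)

# Difference between the two group slopes, tested against the
# residual around the separate within-group slopes (Table 2).
f_slopes = f_ratio(3, 1, 3052, 108)

print(f"F for adjusted means: {f_adjusted_means:.2f}")
print(f"F for slopes: {f_slopes:.3f}")
```

The small gap between these values and the published ones is what one would expect if the original computation carried more decimal places than the table shows.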

On the achievement test scores, the control 
group had a mean of 86.2 and a standard de- 
viation of 7.4. The experimental group had a 
mean of 95.1 and a standard deviation of 4.0. 
When the achievement test scores were ad- 
justed for the effect of PAT test scores, the 
control and experimental group means became
86.9 and 94.7, respectively (Table 2).
The difference in adjusted means was 7.8, 
only slightly less than the difference of 8.9 in 
the unadjusted means. 
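
The adjustment of the achievement means can be sketched as follows. The code assumes the usual covariance adjustment, in which each group's observed Y mean is moved along the common within-groups slope to the overall X mean, and it assumes that the overall X mean is the n-weighted mean of the two group means. All names are illustrative. Since the inputs are the rounded values printed in Table 2, the results match the published adjusted means only to within about a tenth of a point.

```python
# Rounded values from Table 2; names are illustrative.
groups = {
    "control":      {"n": 42, "x_mean": 51.2, "y_mean": 86.2},
    "experimental": {"n": 70, "x_mean": 58.2, "y_mean": 95.1},
}

# Common within-groups slope: within sum of cross-products over
# within sum of squares of X (1,951 / 11,070, about .176).
slope = 1951 / 11070

# Assumed overall X mean: group means weighted by group size.
total_n = sum(g["n"] for g in groups.values())
grand_x = sum(g["n"] * g["x_mean"] for g in groups.values()) / total_n

adjusted = {
    name: g["y_mean"] - slope * (g["x_mean"] - grand_x)
    for name, g in groups.items()
}

for name, value in adjusted.items():
    print(f"{name}: adjusted mean about {value:.1f}")
```

The control mean moves up and the experimental mean moves down, which is the direction expected when the experimental group starts with the higher aptitude scores.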

The standard deviations of the adjusted
achievement test scores for the control and
experimental groups were 7.0 and 3.8, respectively.
An F test of homogeneity of variance
rejected the hypothesis at the .02 level that
both samples were from populations with the
same variance. Thus, the difference in adjusted
achievement test means between the
two groups could have been accounted for by
a difference in variance between the groups
(Edwards, 1950). It was also noted that the
difference in achievement variance was paralleled
by a difference in reasoning ability variance
as measured by the PAT (Table 1).

In order to remove the possible effect of
the initial difference in reasoning ability variance
on achievement variance, control and experimental
groups matched for PAT scores
were set up. This resulted in reducing the
number in each group to 34 cases. The
achievement test means and standard deviations
for these matched samples were 86.7 and
7.3 for the control group, and 93.9 and 4.6
for the experimental group. The differences in
means and standard deviations were found to
be significant at the .01 and .02 levels, respectively,
by t test for matched groups.
Therefore, the higher mean and lower variance
in achievement for the experimental
group did not appear to be due either to differences
in reasoning ability level or variability,
but rather to the different teaching
method used.


TABLE 1

COMPARISON OF CONTROL AND EXPERIMENTAL GROUPS ON BACKGROUND
AND APTITUDE TEST VARIABLES

Columns: Variable; Control (n = 42); Experimental (n = 70);
r with Achievement Test Score (Control, Experimental)

Rows: Education (percent attended college); Total months experience;
Percent with previous computer experience; Programer Aptitude Test

(Cell values are not legible in this copy.)

* Significant at ... level.
** Significant at the .02 level.
*** Significant at the .01 level by t test.


TABLE 2

ANALYSIS OF COVARIANCE OF APTITUDE TEST (X) AND ACHIEVEMENT TEST (Y) SCORES
FOR CONTROL AND EXPERIMENTAL GROUPS

Sums of Squares and Sums of Cross-Products:

          Control Group   Experimental Group   Within    Between   Total
          (n = 42)        (n = 70)             Groups    Groups    (n = 112)
Σx²       6062            5008                 11070     1280      12350
Σxy       1165            786                  1951      1645      3596
Σy²                                            3399

Source of Variation          Sum of Squares   df    Mean Square   F
Between adj. group means     1411             1     1411          50.39*
Within common slope          3055             109   28
Between slopes               3                1     3             .107
Within slopes                3052             108   28
Total                        4466             110

Adjustment of Achievement Test Means:

Group          Observed X Mean   Observed Y Mean   Adjusted Y Mean (a)
Control        51.2              86.2              86.9
Experimental   58.2              95.1              94.7
Total

a Adjusted Y mean = Y_i - b_c(X_i - X), where b_c = common within-groups slope = 1,951/11,070 = .176.
* Significant at .01 level of confidence.


TABLE 3

DISTRIBUTIONS OF ADJUSTED ACHIEVEMENT TEST SCORES FOR CONTROL
AND EXPERIMENTAL GROUPS

Adjusted Achievement     Control (n = 42)     Experimental (n = 70)
Score Level              N      %             N      %
95 and above                    12                   67
90-94                    14     33            15     21
85-89                    9      22            5      7
80-84                    7      17            3      4
75-79                    5      12
70-74                    1      2
65-69                    1      2
Mean                     86.9                 94.7
Standard Deviation       7.0                  3.8

Distributions of the adjusted achievement 
test scores for the control and experimental 
groups are given in Table 3. The distribution 
for the experimental group indicates a con- 
centration of scores at the upper score levels. 
If a score of 95 or above is adopted as an in- 
dication of mastery of the subject matter 
taught, it can be seen that the experimental 
group had 67% at this level or above, compared
to only 12% for the control group. The
PI group thus had more than five times as 
many trainees at the highest achievement 
level. 

On the Student Questionnaire administered 
anonymously to the six experimental classes, 
the replies of the trainees were very favorable 
to PI (Table 4). Of the total group of 70 
men, 87% liked PI more than conventional
instruction, and 83% said they would prefer
using it in future IBM courses. Only 6% 
liked PI less than conventional instruction, 
and 13% would have some objections to using 
it in future courses. A possible reason for the 
size of the latter negative response was the 
impression of some students that PI would 
completely replace the use of instructors and 
class discussions in future courses. 

It was interesting to note that practically 
all of the trainees realized the advantages of 
PI over conventional instruction. All of the 
group (100%) stated before taking the examination
that PI was more effective than
conventional instruction, and 93% also found 
it less difficult. None of the trainees found 
PI more difficult than the present instruction 
method. 

The Student Questionnaire also provided
the trainees with several open-ended questions
asking what they particularly liked or disliked
about PI. In their responses, 69 of the 70
trainees mentioned some aspect of PI which
they liked. A content analysis indicated that
the most frequently liked aspects of PI were
its effectiveness as an instruction method (46
comments); certain characteristics of the
method itself, such as the repetition of important
points, the gradual and logical sequence
of presentation, and the way it maintained
the student’s attention and concentration (23
comments); and the ability to proceed at
one’s individual rate (10 comments). It appeared
from these comments that, through
their own experience with PI, the trainees
themselves recognized a number of the advantages
usually ascribed to it.


TABLE 4

SUMMARY OF STUDENT QUESTIONNAIRE RESPONSES FOR EXPERIMENTAL CLASSES (n = 70)

Compared to the regular classroom instruction in other company
courses you have taken:

1. How do you like the programed instruction (PI) method?
2. How difficult was it to learn using the programed instruction
   (PI) method?
3. How much home study does the programed instruction (PI)
   method require?
4. How well has the programed instruction (PI) method taught
   you the material covered?
5. In future company courses you may take, would you like to see
   the programed instruction (PI) method used in place of the
   regular classroom method?

(Response percentages are not legible in this copy. The surviving
fragments are the column labels "PI Much Less" and "Strongly Object"
and a single 3% entry on Item 1.)

a For each question, the form had a five-point descriptive scale containing very unfavorable to very fav...

In response to the question on what they 
particularly disliked, 40 of the 70 trainees 
wrote in a number of comments, but no single 
comment was made by many individuals. For 
example, there was criticism of the need to 
turn pages constantly (8 comments), the 
amount of repetition and written responses 
required (6 comments), the amount of time 
allotted for studying the materials (7 com- 
ments), and the absence of an instructor and 
class discussion (5 comments). Another criti- 
cism made by 7 trainees was the failure of 
the PI textbook used in this experiment to 
provide adequate summaries or outlines of the 
topics covered to aid in reviewing the material. 

Forty-nine trainees responded to another 
question asking for additional comments, but 
most of these remarks merely amplified the 
positive or negative comments reported above. 
Of most interest were the 14 comments rec- 
ommending the use of PI in other courses. Of 
these, however, 8 trainees qualified this rec- 
ommendation by stating that PI should not 
be used for extended periods without some 
type of instructor contact or classroom dis- 
cussion. 

DISCUSSION


The results of this experiment using PI in 
an industrial training situation corroborated 
the positive findings reported by other investigators
in studies of PI in schools, colleges,
and the Armed Forces (Lumsdaine & Glaser, 
1960). They indicated the reduction in train- 
ing time and the improvement in learning 
achievement possible through the use of PI
in industrial training.


These findings suggested several applica- 
tions to industrial training programs which 
promise important economies. One is the re- 
duction in the number of days that employees 
need to spend at central company training 
centers learning a given course. This reduction 
can be translated immediately into savings in 
the direct daily living expenses and salaries of 
these trainees and eventually into reductions 
of other educational and administrative costs. 

A second application is the possibility of 
greater decentralization of training by en- 
abling employees to be trained in basic courses 
at local field offices or other locations rather 
than in a central company training center. 
Since the trainee works individually on PI 
materials, an educational package can be pre- 
pared for distribution to these field locations. 
The possible economies in this method of in- 
struction can be easily seen. 

In addition to savings in training costs, an- 
other promising result of PI is the possibility 
of better trained employees. At present, there 
appears to be no reason why PI cannot be ap- 
plied to substantial portions of technical, 
manufacturing, clerical, sales, and manage- 
ment training courses now given to company 
employees. Although the effect of better 
trained personnel cannot always be measured 
directly, it is obviously a major factor in im- 
proving industrial efficiency. 

Some important qualifications regarding the 
use of PI for industrial training are suggested, 
however, by the analysis of trainee responses 
to the Student Questionnaire (Table 4). 
While these responses were generally very 
favorable, the write-in comments on particu- 
larly disliked aspects of PI suggested a num- 
ber of areas where potential trainee dissatis- 
faction could impair the effectiveness of a 
training program using PI. These comments 
concerned the frequency of page turning, the 
boredom of too much repetition and writing-in 
of responses, and the feeling of not having 
enough time to go through the programed 
textbook. Although these comments were made 
relatively infrequently in this experiment, the 
areas mentioned must be kept in mind in 
planning to use PI in industrial training. 
Fortunately, much of the page turning can
be eliminated by improvements already un- 
der way in programed text format, rote repe- 
tition can be minimized by the preparation 
of more stimulating programs, and reasonable 
time limits can be determined by preliminary 
tryouts of programed materials. 

Further trainee dissatisfaction with PI 
could arise from failure to integrate it prop- 
erly with other instructional techniques. It 
must be remembered that PI is not a panacea 
for all training ills. While 87% of the trainees
in this experiment expressed a liking for PI, 
there were comments that too much PI with- 
out breaks for class discussion, laboratory, or 
other instructor contact at intervals would, in 
their opinion, become boring. Anyone con- 
cerned with using PI in industrial training 
must therefore carefully plan how to use it in 
limited amounts to supplement existing edu- 
cational procedures rather than to replace 
them completely. It is anticipated that future 
research in PI will furnish suggestions on how 
this may best be done. 


SUMMARY 


Programed textbooks containing 719 frames 
were prepared covering the introductory 15 
hours of a 16-week course for trainees in a 
7070 Data Processing System servicing course. 
Achievement test scores for six experimental 
classes (n = 70) who used these programed 
texts were compared with those of two control 
classes (n = 42) taught by the lecture-discus- 
sion method. Significant gains in achievement 
and reduction of training time were found for 
the experimental classes. Student reaction to 
programed instruction as measured by a ques- 
tionnaire was found to be favorable. 
The importance of job perceptions in the
management hierarchy was discussed in a 
previous paper (Porter, 1961), and results 
were presented on differences in amount and 
type of psychological need satisfactions per- 
ceived as being received in bottom and middle 
management jobs. The present study is con- 
cerned with a different aspect of job percep- 
tions, namely, personality traits that are 
perceived to be important for success in 
particular management jobs. Knowledge of 
such perceptions should be relevant for 
understanding the factors that affect indi- 
viduals’ motivation and performance on the 
job. Ultimately, top management decisions 
with regard to the promotion and placement 
of individuals in lower management positions 
should benefit from this type of information. 

Self-descriptions of individuals in various 
levels of management, obtained by means of 
forced-choice, paired-adjective checklists, have 
been examined previously in a series of stud- 
ies (Porter, 1958, 1959; Porter & Ghiselli, 
1957). The results of these self-description 
studies have shown certain differences be- 
tween levels of management that may in 
part reflect differences both in frequency 
of personality types found in the various 
levels, and in role demands of these levels. 
The present study was designed to produce 
more direct evidence on role demands of 
management jobs as seen by the job incum- 
bents themselves. In the previous self-descrip- 
tion or self-perception studies, the respondent 
was asked merely to describe himself—no 
attention was called to any of his particular
roles (e.g., male, father, plant superintendent,
young adult, foreman, etc.). In the present
investigation, the individual’s specific management
job is the basis on which he makes
his judgments regarding trait requirements.
As in previous studies, the respondent does
not know the types of categories or classes
of jobs that are under investigation, or the
comparisons to be made (e.g., bottom vs.
middle management jobs, staff vs. line jobs,
etc.), and therefore systematic distortions of
perceptions by categories or groups of individuals
are unlikely.

1 This study was begun as part of the research ... the University of California, Berkeley. It was con... Foundation.

The present study specifically investigated 
differences between bottom level managers 
and middle level managers in the perception 
of the relative importance of 13 personality 
traits for success in their respective manage- 
ment jobs. The study also investigated the 
relative perceived importance of the traits 
within each of the two management levels. 


METHOD 
Questionnaire 


The data presented in this study were obtained 
from one section of a three-part questionnaire. This 
section consisted of 13 personality traits (see Table 1 
for a list of these traits) arranged in 78 forced- 
choice pairs. Each trait was paired once with every 
other trait so that the 78 pairs constituted a complete 
paired-comparison matrix. The respondents were 
instructed as follows: 

“The purpose (of this part) is to obtain a picture 
of the traits you believe would best qualify a person 
for your present management position. There are no 
right or wrong answers. In each pair of words, check 
the one you think is relatively more important for 
success in your present management position. Al- 
though specific words will be repeated, no pair of 
words will be duplicated. Make each choice a 
separate and independent judgment, and do not 
omit any pair.” 
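
The structure of this section of the questionnaire can be illustrated with a short sketch. Thirteen traits paired once with every other trait give 13 x 12 / 2 = 78 pairs. The function name, the fixed seed, and the shuffling of pair order and of left/right position are assumptions made for illustration; the source does not say how the pairs were ordered on the form.

```python
import random
from itertools import combinations

# Trait names as listed in Table 1.
TRAITS = [
    "Aggressive", "Conforming", "Cooperative", "Dominant", "Energetic",
    "Flexible", "Independent", "Intelligent", "Original", "Persevering",
    "Poised", "Self-Controlled", "Sociable",
]


def build_pairs(traits, seed=0):
    """Return every unordered pair of traits once, in shuffled order.

    The shuffling is an illustrative assumption, not a description
    of the original form.
    """
    rng = random.Random(seed)
    pairs = [list(pair) for pair in combinations(traits, 2)]
    for pair in pairs:
        rng.shuffle(pair)      # randomize which word is printed first
    rng.shuffle(pairs)         # randomize the order of the items
    return [tuple(pair) for pair in pairs]


pairs = build_pairs(TRAITS)
print(len(pairs))  # 78 forced-choice items
```

Each trait appears in exactly 12 of the 78 items, which is why specific words repeat while no pair is duplicated.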


Sample


Details concerning the sample of respondents have 
been presented elsewhere (Porter, 1961). The sample 
consisted of 64 bottom management individuals 
(first-level supervisors and foremen) and 76 middle 
management personnel (above first-level supervisors
but below vice-presidents or major department
heads). The sample was obtained from three companies,
called Companies A, B, and C in this paper
and in the previous study that utilized the same 
sample.2 Company A is a large, nationwide manu- 
facturing organization, of which one plant was 
sampled for this study. Company B, also a nationwide
concern, produces and distributes a food
product, with a relatively large number of its jobs 
connected with the selling and distributing functions; 
one geographical division of this company provided 
respondents for the study. Company C is a medium- 
sized utility firm, with the respondents from this 
company coming from two of its divisions. Previ- 
ously published details of the sample (Porter, 1961) 
showed that the bottom and middle management 
groups of respondents were similar in median age 
and seniority. The middle management group had a 
somewhat greater proportion of individuals who had 
had some education beyond high school. 


Procedure 

All questionnaires were distributed individually to 
the respondents, through either United States or 
company mail. Each questionnaire was accompanied 
by a company memorandum requesting the indi- 
vidual’s cooperation in the university-sponsored re- 
search project. The potential respondent was in- 
formed in this memorandum that his replies would 
be held confidential and that no individual responses 
would be made available to the company. Respond- 
ents returned completed questionnaires in prestamped 
self-addressed envelopes to the investigator. 


RESULTS 


The results of this study are presented in 
three tables. Table 1 compares bottom level 
management individuals from all three com- 
panies combined, with middle management 
individuals from all three companies. Table 2 
compares all individuals in each company, re- 
gardless of management level, with those in 
each of the other two companies. Table 3 
presents a breakdown of results for the two 
management levels within each of the three 
companies. In all three tables, ranks and 
mean scores are presented for each of the 13 
personality traits. A mean score is based upon 
the number of times the trait is selected in its 
12 comparisons with the other traits; thus, 
mean scores can vary from 0-12, with an 
over-all mean of 6 for the total group of 13 
traits. 
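
The scoring rule described above can be sketched as follows. A respondent's score on a trait is the number of its 12 pairings in which that trait was chosen, so the 13 scores always sum to 78 and average 6. The function name and the example respondent are hypothetical and serve only to show the bookkeeping.

```python
from itertools import combinations

TRAITS = [
    "Aggressive", "Conforming", "Cooperative", "Dominant", "Energetic",
    "Flexible", "Independent", "Intelligent", "Original", "Persevering",
    "Poised", "Self-Controlled", "Sociable",
]


def trait_scores(traits, choices):
    """Count how often each trait was chosen across all pairs.

    `choices` maps a frozenset of two traits to the trait selected.
    """
    scores = {trait: 0 for trait in traits}
    for first, second in combinations(traits, 2):
        chosen = choices[frozenset((first, second))]
        if chosen not in (first, second):
            raise ValueError(f"{chosen!r} is not in the pair")
        scores[chosen] += 1
    return scores


# Hypothetical respondent who always picks the alphabetically
# earlier word; used here only to exercise the function.
choices = {frozenset(pair): min(pair) for pair in combinations(TRAITS, 2)}
scores = trait_scores(TRAITS, choices)
mean_score = sum(scores.values()) / len(scores)
print(scores["Aggressive"], scores["Sociable"], mean_score)
```

Averaging such per-respondent scores over a group gives the mean scores reported in the tables, and the fixed total of 78 is a useful check on any reconstructed column.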

2 The author wishes to thank the companies that
agreed to participate in this study, and their management
personnel who supplied the basic data.


TABLE 1
MEAN SCORES AND RANKS FOR TRAITS BY MANAGEMENT LEVELS

                   Bottom Management     Middle Management
                   (N = 64)              (N = 76)
Trait              Mean Score   Rank     Mean Score   Rank
Aggressive                      8        6.34
Conforming                               3.21
Cooperative                              9.13
Dominant                                 1.42
Energetic                                7.45
Flexible                                 7.08
Independent                              2.53
Intelligent                              9.08
Original                                 6.04
Persevering                              6.47
Poised                                   5.46
Self-Controlled                          7.68
Sociable                                 6.11

(Blank cells are not legible in this copy.)

Table 1 shows that there was a very high 
correlation (rho = .97) between the ranks 
(and mean scores) of the traits as selected by 
bottom management and those as selected by 
middle management. 
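
For readers who wish to see how a rank-order coefficient of this kind is obtained, a minimal sketch follows. It assumes that rho here is the Spearman rank-order coefficient computed without tied ranks, which the source does not state explicitly. The two rank lists are hypothetical and are not the ranks from Table 1.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank-order correlation for two untied rankings.

    Uses rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), which holds
    only when neither ranking contains ties.
    """
    if len(ranks_a) != len(ranks_b):
        raise ValueError("rankings must have the same length")
    n = len(ranks_a)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Hypothetical ranks of 13 traits in two groups; the second list
# swaps two adjacent pairs of ranks relative to the first.
group_one = list(range(1, 14))
group_two = [1, 3, 2, 4, 5, 6, 7, 8, 9, 11, 10, 12, 13]

rho = spearman_rho(group_one, group_two)
print(f"rho = {rho:.3f}")
```

With 13 traits, even a handful of small rank shifts leaves rho close to 1, which helps put a value such as .97 in perspective.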

The other important result emerging from 
Table 1 was the relative position of adjectives 
indicating cooperativeness and a willingness 
to adjust to other individuals (Conforming, 
Cooperative, Flexible, and Sociable), in com- 
parison with those indicating independence 
and individuality (Aggressive, Dominant, Independent,
and Original). Within both man-
agement groups the cooperative-type adjec- 
tives were on the average considerably higher 
ranked in perceived importance to success on 
the job than were the items depicting “rugged 
individualism.” As can be seen in Table 1, the 
trait of Cooperative even outranked and out- 
scored Intelligent; this was true for both 
levels of management, but it was especially so 
for individuals in bottom management jobs.

In Table 2, where the data are presented 
by companies rather than by management 
levels, there were again high rank-order cor- 
relations of traits from company to company. 
However, on two items there were major 
shifts in ranks from company to company. 
For the Aggressive trait, a rank of fourth in 
importance was obtained from Company B, a
company in which a number of the managers 
in the obtained sample had sales supervisory 
duties. In Company A, the manufacturing 
concern, Aggressive had a middle rank of 7, 
and in the utility company it had a relatively 
low rank of 10. The other item that fluctuated 
from company to company was Self-Con- 
trolled, which had a high rank with Com- 
panies A and C, and only a middle rank with 
Company B. The variations in the placement 
of these two items were in accordance with 
expectations based on other knowledge of the 
companies involved. As previously mentioned, 
Company B had a number of sales managerial 
positions in its sample, and therefore it was 
not surprising that individuals in that com- 
pany stressed Aggressive as relatively impor- 
tant and Self-Controlled as only moderately 
essential. Company C, on the other hand, is a 
utility company with a more traditional for- 
mal organization and operating as a noncom- 
petitive economic enterprise. The individuals 
from Company C tended to place Aggressive 
considerably lower than did those in Com- 
pany B, and they also put relatively more 
emphasis on Self-Controlled. Company A managers,
working in a manufacturing organization
in a competitive industry, but operating
under a traditional formal organization setup
with rigid lines of authority and responsibility,
might be expected to have job perceptions
somewhat in between those of Company B
and Company C managers on the items
Aggressive and Self-Controlled. This in fact
was true for Aggressive, but not for Self-Controlled,
which was ranked quite similarly
by Company A and Company C managers.
(Company A might also have been more
similar to Company C on Aggressive had the
N for the first-level managers in the former
company been as large as that in the latter.)
Thus, although Companies A and B both
operate in competitive private industry, Company
A tends to be more similar to a noncompetitive
organization, Company C. This
suggests that immediate job functions and the
type of formal organizational structure are
the more crucial factors in the determination
of the types of job perceptions studied here.


TABLE 2
MEAN SCORES AND RANKS FOR TRAITS, BY COMPANIES

                   Company A           Company B           Company C
                   (N = 40)            (N = 53)            (N = 47)
Trait              Mean Score  Rank    Mean Score  Rank    Mean Score  Rank
Aggressive         6.42        7                           5.04        10
Conforming                                                 4.43        11
Cooperative        9.95        1                           9.87        1
Dominant           1.10        13                          1.60        13
Energetic          7.40                                    6.66
Flexible           7.08                                    6.79
Independent        2.28                                    2.68
Intelligent        8.38                                    9.49
Original           5.45
Persevering        6.20
Poised             4.95
Self-Controlled    8.42                                    7.87
Sociable           6.60                                    6.38

(Blank cells are not legible in this copy. Two further Company C
mean scores, 6.47 and 5.51, survive without a legible row assignment
among Original, Persevering, and Poised.)


The shifts in ranks for the two items of 
Aggressive and Self-Controlled in Table 2 not 
only fit expectations based on knowledge of 
company organizational setups and functions, 
but they also demonstrate that the type of 
questionnaire instrument used in this study 
was sensitive enough to pick up differences in 
perceptions between different populations of 
respondents. 

Again, in Table 2, as in Table 1, the co- 
operative-type traits were higher ranked on 
the average in all three companies than were 
the traits concerned with individuality and 
independence. 

Table 3, which presents a breakdown of 
the data by the two management levels within 
each of the three companies, shows that the 
two levels were generally quite similar to each 
other within each of the three different or- 
ganizations. However, Table 3 reveals one 
interesting finding that was not clearly ap- 
parent in Tables 1 and 2. If the four “co- 
operative” adjectives are compared on mean 
scores with the four “independent” adjectives 
as described earlier in this section, it can be 
seen that in Companies A and C there was a 
fairly consistent trend for middle manage- 
ment individuals to have perceived the “co- 
operative” traits as relatively less important 
than did the bottom management individuals, 
and to have perceived the “individualistic” 
traits as relatively more important. (In both 
of these companies, nevertheless, the “cooperative”
adjectives received definitely higher
values than did the “individualistic” adjectives.)
Since the groupings of these two sets
of adjectives were accomplished post hoc
rather than prior to the experiment, no meaningful
statistical treatment can be applied to
this apparent trend, and thus it remains suggestive
rather than proven. It should also be
noted that in Company B, which emphasizes
sales and distributive functions, the two management
levels were about equal in their relative
emphasis on “cooperative” and “independent”
traits.


TABLE 3

MEAN SCORES AND RANKS FOR TRAITS, BY MANAGEMENT LEVELS WITHIN COMPANIES

Mean scores; ranks, where legible, in parentheses.

                   Company A               Company B               Company C
                   Bottom      Middle      Bottom      Middle      Bottom      Middle
Trait              (N = 16)    (N = 24)    (N = 26)    (N = 27)    (N = 22)    (N = 25)
Aggressive         6.00        6.71 (6)    7.23        7.22 (4)    5.05        5.04
Conforming         5.56        2.58 (12)   3.96        3.15 (11)   5.05        3.88
Cooperative        10.69       9.46 (1)    8.81        8.48        10.27       9.52
Dominant           0.88        1.25 (13)   2.23        1.52        1.73        1.48
Energetic          6.81        7.79        8.00        7.89        6.68        6.64
Flexible           6.25        7.62        6.58        6.93        6.86        6.72
Independent        1.69        2.67        2.35        1.93                    3.04
Intelligent        8.69        8.17        8.46        9.56                    9.44
Original           4.38        6.17        6.15                                5.52
Persevering        5.81        6.46        6.62        6.37                    6.60
Poised             5.12        4.83        5.65        5.89                    5.60
Self-Controlled    8.69        8.25        5.92        6.74                    8.16
Sociable           7.44        6.04        6.04        5.93                    6.36

(Blank cells are not legible in this copy.)


DISCUSSION


Two major findings emerge from this
study. The first is the fact that there was
little difference between bottom level and
middle level managers in how they ranked the
13 common personality traits in terms of perceived
importance for success in their respective
jobs. This finding applies to the relative
perceived importance among these 13 traits.
If the respondents had been asked to make
absolute ratings of these traits, rather than to
give them relative ranks via the forced-choice
method, it is possible that significant differences
between management levels might have
been produced. Such differences in absolute
perceived importance could not be determined
from the present instrument.

It is also possible, of course, that some of 
the similarity in rankings between the two 
management groups was due to general social 
or personal desirability differences among the 
traits, with such differences having little to 
do with specific job requirements. However, 
other evidence from the present study indi- 
cated that general social desirability could not 
entirely account for the obtained similarity:
in Table 2, where results were presented by 
companies rather than by management levels, 
the ranks of some of the traits shifted rather 
widely from company to company in accord- 
ance with the probable psychological demands 
of managerial jobs in those companies. Thus, 
when the respondents were grouped on other 
bases than management levels, the obtained 
ranks seemed to reflect particular organiza- 
tional conditions and were not totally the re- 
sult of some general differences in social de- 
sirability of the items. 

The second major finding from this study 
involves the relatively high ranks obtained for 
the traits showing a concern for adapting to 
the feelings and behavior of others (the
cooperative-type items) compared with the
relatively low ranks for traits showing a strong
emphasis on personal and individual capabilities
(the independent-type items). The former
cooperative-adaptable items consistently were 
ranked higher than the latter individualistic- 
independent items, whether the analysis was 
by management level or by company. Again, 
this finding could be due to differences in the 
general social desirability of the particular 
items; however, in our present-day industrial
society where great emphasis has been placed 
on “individual worth,” etc., items such as 
Independent and Original hardly seem less 
personally desirable for individuals than items 
such as Cooperative and Flexible. The results
of the present study would seem to suggest 
that the cooperative-type traits definitely are 
perceived as more important for success in 
lower and middle management positions in 
business than are independent-type traits. 
There is, of course, no evidence in the present 
study to determine whether the perceptions 
accurately represent reality. To the extent 
that they do, however, they raise a problem 
as to the fate of Original, Dominant, Inde- 
pendent individuals in lower managerial levels 
in large organizations. It would seem that 
either these individuals would have to con- 
form in their behavior to their perceptions of 
the type of person who gains success in their 
positions, or else they would probably have to 
forego as rapid advancement up the organ- 
izational ladder as individuals who fit (either 
naturally or by effort) the successful stereo- 
type. In either case, the organization would 
probably suffer the loss of some degree of 
originality and independence in its future top 
echelon executives. This would seem to be an 
undesirable occurrence from the organiza- 
tion’s point of view, if the statements of many 
presidents and other high-ranking corporation 
executives can be taken at face value concern- 
ing the necessity for managers to show initia- 
tive, self-reliance, and creativity in dealing 
with company problems. In other words, it 
appears that many top-level executives may 
be advocating one type of behavior, but
rewarding through a “law of effect” mechanism
quite another type of behavior. If it can be
assumed that the higher the individual is in 
the organization the greater such behavior as 
originality and independence is demanded by 
the job requirements, the question then
becomes one of how organizations ensure that
individuals who are best suited in these types 
of traits will be the ones that are likely to 
advance to top management positions. 


SUMMARY 


This study investigated the perception of 
the relative importance of various personality 
traits for success in management jobs. The 
perceptions of 64 individuals in bottom man- 
agement were compared with those of 75 
individuals in middle management jobs. The 
data were obtained by a questionnaire that 
consisted of 13 common personality traits 
arranged in 78 forced-choice pairs, where 
each trait was paired once with every other 
trait. The respondents were asked to check 
the one word in each pair that they thought 
was relatively more important for success in 
their particular management positions. The 
results showed the following: a high correla- 
tion between the trait rankings derived from 
the selections of the lower-level managers and 
those obtained from the middle-level man- 
agers; a high selection of traits indicating 
cooperativeness relative to traits indicating in- 
dependence, within both management levels; 
and a moderate trend for the cooperative-type 
traits to be perceived as relatively more im- 
portant for bottom management jobs than for 
middle management jobs. 
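
The scoring implied by this forced-choice design can be sketched as follows. With 13 traits, pairing each trait once with every other gives 13 x 12 / 2 = 78 pairs, and a trait's score for one respondent is the number of pairs in which it was checked, so scores run from 0 to 12 and sum to 78. The trait labels are those of Table 3; the function names and the sample respondent are hypothetical, and the simple averaging step is an assumption about how the mean scores were formed rather than a procedure stated in the report.

```python
from itertools import combinations

# Trait labels as they appear in Table 3.
TRAITS = [
    "Aggressive", "Conforming", "Cooperative", "Dominant", "Energetic",
    "Flexible", "Independent", "Intelligent", "Original", "Persevering",
    "Poised", "Self-Controlled", "Sociable",
]


def all_pairs(traits):
    """Return every unordered pair of traits (each trait meets each other once)."""
    return list(combinations(traits, 2))


def score_respondent(choices, traits=TRAITS):
    """Count how often each trait was chosen by one respondent.

    `choices` maps each pair (a tuple from `all_pairs`) to the trait the
    respondent checked in that pair.
    """
    scores = {trait: 0 for trait in traits}
    for pair in all_pairs(traits):
        chosen = choices[pair]
        if chosen not in pair:
            raise ValueError(f"{chosen!r} is not a member of pair {pair!r}")
        scores[chosen] += 1
    return scores


def mean_scores(respondent_scores, traits=TRAITS):
    """Average the per-respondent scores for each trait (assumed aggregation)."""
    n = len(respondent_scores)
    return {t: sum(s[t] for s in respondent_scores) / n for t in traits}


# Hypothetical respondent who always checks the first-listed word of a pair.
example_choices = {pair: pair[0] for pair in all_pairs(TRAITS)}
example_scores = score_respondent(example_choices)
```

Because every respondent's scores sum to 78 under this scheme, the mean scores for any one group should also sum to 78, which gives a quick consistency check on a transcribed column of means.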


EFFECT OF SIMULATED APPLICANT STATUS
ON KUDER FORM D OCCUPATIONAL 
INTEREST SCORES 


C. S. BRIDGMAN AND G. P. HOLLENBECK


Bureau of Industrial Psychology, University of Wisconsin 


Kuder (1950, 1957) has developed an 
interest inventory (Occupational, Form D) 
which provides scales for various occupations. 
He suggests in the manual (1956a, p. 5) that 
these scales and others which could be de- 
veloped for additional occupations might be 
of assistance in selection of industrial per- 
sonnel. 

Each Kuder scale is developed by selecting 
items which differentiate maximally between 
a group already employed in the given oc- 
cupation and a base group representing em- 
ployed people in general (Kuder, 1956b). 
Applicants presumably would be influenced 
by a desire to make a good impression to a 
greater extent than would individuals who 
are already established in the occupation. 
Therefore, use of such an inventory for 
selection purposes raises the problem of re- 
sponse bias. Kuder provides a verification
scale to help identify respondents who either 
have answered incorrectly or carelessly, or 
who may have answered insincerely. His data 
indicate relatively little overlap between 
verification scores obtained when subjects re- 
sponded sincerely and when they were in- 
structed to make a good impression. For 
example, only 10% of a group of 50 college
students obtained verification scores in the
“acceptable” range, when instructed to con- 
ceal their faking while giving “best impres- 
sion” responses (Kuder, 1956a). 

The question still remains whether scores 
can be biased in the desired direction on such 
occupational scales, and if so, whether the 
bias will be revealed by a shift in verification 
scores, when respondents answer under conditions
more closely approximating the application
situation, i.e., where the set for faking
is less firmly and explicitly established than 
is presumably the case with the “best impres- 
sion” set which Kuder used to demonstrate 
the effectiveness of the verification scale. 

METHOD


To explore these and related questions, Kuder’s
Form D was administered to four groups of students
in elementary psychology classes under instructions
outlined below. Three of the groups were asked to fill
out the interest inventory as they would if applying,
respectively, for a specific sales job (sanitary supply
salesman), the job of industrial psychologist, and an
unspecified “job in industry.” The salesman job was
described briefly for the first group, and the psychologist
job was described briefly for the second group. In each
case the subjects were told to assume that they had
the necessary background and were interested in
obtaining the position. However, no explicit instructions
were given to make a good impression or to
falsify their responses.¹ For comparison purposes a
fourth group was given vocational guidance instructions,
i.e., they were asked to complete the inventory
accurately in order to obtain help in making a
vocational choice.

The answer choices of these four groups were
scored on a sanitary supply salesman scale (developed
at the University of Wisconsin) and on the
Kuder industrial psychologist scale. Verification scores
were also obtained.


By comparisons among scores of the experimental 
groups and available reference groups, we have been 
able to consider the following questions: 


1. Can students approximate the responses of individuals
actually employed in the occupations under
consideration, after instructions to respond as though
they were interested applicants?

2. Are the effects of assuming the role of job
applicant specific to the particular occupation, or are
these effects the result of a generalized effort to
“look good”?

3. Do the unbiased scores of student groups differ
from those of Kuder’s base group on these occupational
scales?

4. Can Kuder’s verification scale differentiate between
biased and unbiased groups when the bias set
is established through instructions to assume the role
of job applicant?


RESULTS AND DISCUSSION 


The mean scores for groups instructed to
act as applicants for sales and for psychologist
jobs did not differ significantly from the mean
scores for the corresponding occupational
groups (Table 1).² The students, simply by
assuming an applicant set based on a brief
job description, obtained distributions of
scores comparable to those of individuals
actually employed in the two occupations.


TABLE 1
MEAN OCCUPATIONAL INTEREST SCORES AND STANDARD
DEVIATIONS FOR GROUPS GIVEN SPECIFIC JOB
INSTRUCTIONS AND FOR THE CORRESPONDING
OCCUPATIONAL GROUPS

Scale                     Group                         N    Mean     SD
Salesman                  Sales instructions           70    71.5ᵃ   10.5
                          Actual salesmen              50    73.1     9.6
Industrial Psychologist   Psychologist instructions    50    53.2ᵃ    7.9
                          Actual psychologists        200    54.5     9.7

ᵃ Mean not significantly different from mean of corresponding
occupational group.


1 Copies of the detailed instructions can be
obtained from the Bureau of Industrial Psychology.
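
The comparisons summarized in Table 1 can be checked from the reported means, standard deviations, and group sizes alone. The sketch below assumes an ordinary pooled-variance t test for independent groups; the report says only that t tests were used, so the exact form is an assumption, and the function name is hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t for two independent groups from summary statistics.

    Returns (t, degrees_of_freedom), pooling the two sample variances.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df


# Values as read from Table 1: instructed students versus the
# corresponding employed occupational group.
t_sales, df_sales = pooled_t(71.5, 10.5, 70, 73.1, 9.6, 50)
t_psych, df_psych = pooled_t(53.2, 7.9, 50, 54.5, 9.7, 200)

print(f"salesman scale:     t = {t_sales:.2f}, df = {df_sales}")
print(f"psychologist scale: t = {t_psych:.2f}, df = {df_psych}")
```

Both values of t are smaller than 1 in absolute size, well short of the two-tailed .05 critical value of roughly 1.97 to 1.98 at these degrees of freedom, which agrees with the statement that neither instructed group differed significantly from its occupational group.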

Among the experimental groups, the highest 
occupational interest scores on each key were 
obtained by the group given specific instruc- 
tions appropriate to the key (Table 2). Non- 
specific instructions to apply for a job in 
industry were significantly less effective than 
the specific instructions appropriate to the 
key, but did produce mean scores significantly 
higher than vocational guidance (sincere) 
instructions.

It might be argued that presentation of the
salesman and psychologist job descriptions
merely established more firmly a generalized
set to look good, in comparison to the job-in-
industry instructions, and that this could
account for the higher scores. However, evi-
dence for the specificity of the effects of the
specific job instructions was found when the
groups given these instructions were scored
on the noncorresponding scale (i.e., salesman
instructions group scored on psychologist
key, and psychologist instructions group
scored on salesman key), as shown in the
appropriate cells of Table 2. Salesman in-
structions were only as effective as nonspecific
job instructions in increasing scores on the
psychologist scale. On the other hand, psy-
chologist instructions did not produce an in-
crease in scores on the salesman scale above
the mean obtained by the vocational guidance
group.

2 Analyses of variance indicated significant
(p < .01) differences between means for the groups
in Table 1 on both occupational scales and the
verification scale. Subsequent t tests were employed
for differences between specific means. A probability
level of less than .05 was accepted as significant.
Significant differences between specific means are
indicated in the text and tables.

It may be concluded that the specific job 
instructions produce answer choices appropri- 
ate to the given occupation. The resulting 
scores cannot be attributed simply to a more 
effectively established general set to make a 
good impression. 

The vocational guidance group did not 
differ significantly from a sample of Kuder’s 
representative employed group when both 
were scored on the sales interest scale.³ (Data
for Kuder’s group: N = 97, mean = 60.7,
standard deviation = 9.9. Data for our group
are in Table 2.) However, the industrial
psychologist scores of these two groups differed
markedly. The mean of the students (44.6) 
fell approximately halfway between that of 
the representative employed group (33.2) and 
that of the actual industrial psychologists 
(54.5). (Data for Kuder’s group: N = 97, 
mean = 33.2, standard deviation = 9.8. Data 
for the other two groups can be found in 
Tables 1 and 2.) 

It is not particularly surprising that, in 
their unbiased responses to this inventory, 
our students are more like industrial psy- 
chologists than is Kuder’s representative em- 
ployed group. However, this finding raises an 
important question as to the appropriate base 
group which should be used in developing 
these specialized occupational interest scales. 


3 The employed group is a sample of 97 whose 
answer sheets were obtained from Science Research 
Associates. 


TABLE 2
MEAN OCCUPATIONAL INTEREST SCORES AND STANDARD
DEVIATIONS FOR EXPERIMENTAL GROUPS

                                                          Scoring Key
                                                   Salesman      Psychologist
Type of Set             Instructions           N   Mean    SD    Mean    SD
Specific Applicant      Salesman              70   71.5ᵃ  10.5   48.7    9.2
Specific Applicant      Psychologist          50   61.4          53.2ᵃ   7.9
Nonspecific Applicant   Job in industry       55   64.8ᵃ   6.9   48.9    9.1
Sincere                 Vocational guidance  100   61.8    9.1   44.6ᵃ   8.9

ᵃ Mean significantly different from the means of the three
remaining groups.


TABLE 3 
VERIFICATION MEAN SCORES AND STANDARD 
DEVIATIONS OF EXPERIMENTAL AND 
COMPARISON GROUPS 


Mean 


Salesman 
Psychologist 

Job in industry 
Vocational guidance 


46.8"! 
45.5! 
51.1» 


Instruction 
Group 


Comparison 
Group 


Kuder’s representative 
employed 54.0 


* Significantly different from vocational guidance mean 
» Significantly different from representative employed mean 


If the scale is to be used to discriminate 
among applicants or potential applicants, 
then samples of applicants can be suggested 
as appropriate for use as a base in developing 
the occupational scale. At least this procedure 
should be considered when there is any reason 
for suspecting that the responses of the appli- 
cants will differ systematically from Kuder’s 
representative employed group. 

The verification mean scores of the bias 
groups were significantly lower than that ob- 
tained by the vocational guidance group, 
except in the case of job in industry set, 
as shown in Table 3. (The scale is designed 
to give lower scores for individuals who are 
trying to make a good impression.) However, 
the observed means for our applicant set 
groups are not nearly as low as those found 
by Kuder for best impression responses, since 
he has reported mean scores in the vicinity 
of 35 for a number of groups responding 
under this set. Another Kuder group, in- 
structed to conceal its bias, obtained a mean 
of 41 on the verification scale (Kuder, 
1956a). The overlap of the distributions of 
the sample of Kuder’s representative em- 
ployed group and his group instructed to 
conceal its bias was 16%. However, the overlaps
between the vocational guidance instructions
group and the salesman, psychologist,
and “job in industry” instructions groups were,
respectively, 67%, 54%, and 82%. Thus it
must be concluded that effective bias of oc- 
cupational scores can be achieved without a 
sufficiently large shift in verification scores 
to ensure identification of the presence of bias 
in individual cases. The possibility remains 
that more discriminative modified verification 
scales could be developed for use with each 
occupational scale. 
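
The relation between mean separation and percentage overlap can be illustrated with a short sketch. It assumes that overlap means the common area under two normal curves, which is one conventional definition; the report does not state how the overlap figures were computed. The standard deviation used is hypothetical, since the verification-scale standard deviations are not given in the text, and `percent_overlap` is an illustrative name.

```python
from statistics import NormalDist


def percent_overlap(mean1, sd1, mean2, sd2):
    """Percentage overlap of two normal curves (overlapping coefficient x 100)."""
    return 100 * NormalDist(mean1, sd1).overlap(NormalDist(mean2, sd2))


ASSUMED_SD = 5.0  # hypothetical value, for illustration only

# Means quoted in the text: Kuder's representative employed group (54.0)
# versus his group instructed to conceal its bias (41).
far_apart = percent_overlap(54.0, ASSUMED_SD, 41.0, ASSUMED_SD)

# Two means that lie closer together (both appear in Table 3).
closer = percent_overlap(54.0, ASSUMED_SD, 51.1, ASSUMED_SD)

print(f"means 13.0 apart: {far_apart:.0f}% overlap")
print(f"means  2.9 apart: {closer:.0f}% overlap")
```

Under these assumptions a smaller shift in the verification mean leaves a much larger share of the two distributions in common, which is why a shift that is significant for group means may still be of little use for identifying bias in individual cases.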


SUMMARY 


Separate groups of college students were 
instructed to assume they were applying for 
specific jobs (sanitary supply salesman and 
industrial psychologist). When scored on the 
appropriate scale of Kuder’s interest inven- 
tory (Form D), both groups obtained scores 
comparable to the groups employed in these 
occupations. 

College students instructed to apply for an 
unspecified “job in industry” showed signifi- 
cantly higher means on both scales than a 
group given vocational guidance instructions, 
indicating that some part of the bias noted 
above can be introduced by such a general 
nonspecific set. However, evidence was pre- 
sented that the instructions to apply for spe- 
cific jobs produced responses appropriate to 
the specified occupation, rather than simply
inducing a more effective nonspecific set. 

College students given vocational guidance 
instructions obtained scores comparable to 
the base group (representing employed people
in general) on the sales interest scale, but 
scored significantly above the base group on 
the industrial psychologist scale. This result 
was interpreted as implying the need to use 
a base group comparable to the applicant 
group when it is the purpose of the investi- 
gator to develop Kuder-type interest scales 
to be used for selection purposes. 

Kuder’s verification scale differentiated 
significantly between the groups responding 
with an applicant set and the vocational 
guidance group. However, the differentiation 
was not nearly as effective as reported by 
Kuder between sincere and best impression 
groups. The differentiation was not sufficient 
to warrant use of the verification scale in the 
manner recommended by Kuder. 


The past few years have seen the large-scale
usage, on both the experimental and operational
levels, of many psychometric devices
for selecting students for a variety of
professional training programs. Most of the 
experimental programs have tended to con- 
centrate upon academic success, as measured 
by grades or other rating procedures, as the 
principal criterion for empirically evaluating 
the usefulness of these selection procedures. 

Since professional success is virtually al- 
ways contingent upon the completion of an 
academic training program, this concentration 
upon academic success is obviously useful and 
important. But the relationships between aca- 
demic success and later professional success, 
beyond the direct effects of the pass-fail di- 
chotomy in completing training, are mainly 
ignored. Since all professions seem to circu- 
late anecdotal reports of the mediocre student 
who makes a significant professional contribu- 
tion, as well as of the brilliant student who 
sinks into obscurity, it is probable that the 
relationships between academic and professional
success are neither simple nor direct.

The difficulties in tracing the careers of 
professional persons, especially the difficulties 
in evaluating relative success in a profession, 
have resulted in a paucity of follow-up studies 
and, consequently, very little is known either 
about the relationships of academic and pro- 
fessional success or about the usefulness of 
tests in predicting later professional success. 
The present paper is an attempt to add to our
meager knowledge about these problems by
studying such relationships in one profession,
that of the public health educator.


1 This research was completed while the senior
author, on leave from the University of Iowa, was
serving as a Research Consultant to the University of
California Counseling Center. He has now returned
to the University of Iowa, Iowa City.

The authors are indebted to Roger W. Cummings
for his assistance with the statistical analysis of the
data. William Griffiths, Dorothy B. Nyswander, and
Beryl Roberts of the University of California School
of Public Health, Berkeley, were of invaluable
assistance in securing the test protocols and in
providing some of the rankings.


Barthol and Kirk (1956) have reported 
upon the successful development of a psycho- 
metric battery for selecting graduate students 
in public health education, with faculty rank- 
ings of academic progress used as the cri- 
terion. The students with the best background 
(combined years of prior work experience 
and/or prior academic training in public 
health), with the highest previous achieve- 
ment in public health (scores above the norm 
group mean on the American Public Health 
Association [APHA] Examination), and with
the highest levels of mental ability (scores 
above 100 on the Concept Mastery Test— 
Terman, 1956) were rated as the best stu- 
dents in both the classroom and in field work 
placements. Those students whose measured 
interests were in working with people (high 
Strong Vocational Interest Blank [SVIB]
scores in the Group V, welfare, and Group X, 
verbal-linguistic, occupations) and those stu- 
dents who were relatively free from person- 
ality disturbances (no score above 70 on the 
Minnesota Multiphasic Personality Inventory 
[MMPI], except on the Mf scale) were also
rated as the better students but these latter 
results were less statistically reliable. The pur- 
pose of the present study is to investigate the 
relationships between these test indices of 
academic success and later on-the-job success 
as a public health educator. 


METHOD
Subjects 


Of the 20 students in the first class of the Barthol
and Kirk (1956) study, 19 had graduated and were
used as the Ss in the present study. All 19 had
received the MPH degree from the University of
California and had been out of school for 6 years at the
time this follow-up study was initiated. The twentieth
student in the original study had contracted
polio and had not completed the program.

Graduate Students in Public Health Education


Procedure 


The professional work history during the 6 years 
following graduation was obtained for each of the 
19 Ss from the records of the School of Public 
Health. In addition, each graduate received a written 
request from a member of the school faculty to again 
take the SVIB and MMPI. Since all 19 Ss complied 
with this request, there are two sets of SVIBs and 
MMPIs available, one obtained at the time of en- 
try into the school and one obtained 6 years after 
graduation. 

Each of the four sets of profiles with the Ss’ names
removed was separately and independently ranked
along a continuum of “potential success as a public
health educator” by two counseling psychologists
with considerable experience in the use of these
instruments. The reliability of these ratings was shown
by the high interjudge agreement (rho’s from .76 to
.82) between the raters.

Seven trained counseling psychologists, all of them
having some familiarity with the field of public
health education, then independently ranked the
anonymous 19 work histories along a continuum of
“over-all success in public health education.” The
subsequently obtained mean rank for each individual
was then used as the criterion measure of professional
success. The obtained rank-order correlations
between each judge and the ranking of the summed
ranks (minus that particular judge’s ranking) ranged
from .48 to .93 with a median of .84. These results
obtained between seven independent judgments suggest
that the rankings are reliable enough to be used
as the criterion measure.

Two of the three faculty judges who had ranked
academic success in the original study (grades were
not used because of the restriction in range at the
graduate level) also ranked the over-all success of
these 19 Ss with a rank-order correlation of .87
between their two rankings. It was decided not to use
their current rankings as the criterion measure,
because these current rankings may have been
contaminated by the prior (1952) academic rankings.
Nevertheless, the rank-order correlation between the
mean ranking by the faculty judges and the criterion
ranking was .78, indicating considerable common
variance.

RESULTS 

The rank-order (rho) correlations between 
each of the tests and the 1959 professional 
success criterion are presented in Table 1. 
The rho’s previously reported by Barthol and 
Kirk (1956) between the tests and the 1952 
academic success criterion are also presented 
in Table 1 for the purpose of comparison. 

TABLE 1

RANK-ORDER CORRELATIONS OF TEST DATA
AND CRITERIA OF SUCCESS

[Table body not recoverable. Surviving row labels: Background, 1952 MMPI ranking, 1952 SVIB ranking, 1959 MMPI ranking, 1959 SVIB ranking, 1952 Academic ranking.]


Table 1 indicates that those students with
the best background in public health and
whose entry (1952) MMPI profiles were rated
as showing the best potential were ranked as
the most professionally successful. Both of
these predictors were equally successful in
predicting the 1952 academic success ranking,
although these particular MMPI evaluations
had not been previously examined. The scores
on the APHA examination, the best predictor
of academic success, do not significantly cor-
relate with the professional success criterion.
The SVIB global rankings, which also had not
been previously examined but which are nega-
tively and significantly correlated to the aca-
demic criterion, are negatively but not sig-
nificantly related to the professional success
criterion. It is noteworthy that the criterion
rankings of academic success by the three
faculty judges correlated .70 with the criterion
of professional success, making these aca-
demic rankings the best predictors of later
professional success.

The rankings of the current (1959) MMPI 
and SVIB profiles are not significantly correlated
with the professional success criterion.
Both sets of mean MMPI profiles are relatively
flat with the mean T scores on the nine
clinical scales ranging from 49 to 61. While 
a comparison of the 1959 mean MMPI pro- 
file with its 1952 counterpart indicated virtu- 
ally no change except for a small, nonsignifi- 
cant drop on Hypochondriasis (Hs), an in- 
spection of the individual profiles suggested 
considerable change. In an effort to assess this 
change, the two mean MMPI profiles of the
six most successful Ss were compared with
those of the six least successful Ss (using
the 1959 criterion). The most successful Ss
showed an increase over time in the mean 
score on every scale but Hs, with the largest 
increases on the Psychasthenic (Pt), Psycho- 
pathic deviate (Pd), and Depression (D) 
scales, while the least successful Ss showed a 
decrease over time in the mean score on every 
scale with the largest decreases on Hs, Schizo- 
phrenia (Sc), Pt, and Mania (Ma). These 
interactional differences are significant (p
< .05), however, only in the case of the Hs,
Pt, and Sc scales. Thus, although there are no 
differences between the over-all mean 1952 
and over-all mean 1959 MMPI profiles, it 
would appear that the most successful Ss 
show a rise in their profiles while the least 
successful Ss show a decline in their profiles. 
This change is most striking in the case of the 
Pt scale where the most successful group in- 
creases an average of 8.8 T score units while 
the least successful group decreases an aver- 
age of 9.14 units. 

Both sets of SVIB profiles demonstrate a 
primary pattern in the Group V (welfare) oc- 
cupations with the highest scores on Public 
Administrator, Social Science High School 
Teacher, Social Worker, and Minister keys. 
A comparison of both the two mean profiles 
and the individual profiles does not suggest 
that any striking changes have occurred in 
the pattern of measured interests. There was 
some slight tendency for scores in the Group I 
(scientific) and Group II (technical) occupa- 
tions to decrease and for scores in all the other 
groups to increase in time with the largest in- 
creases in Group V (welfare) especially on 
the Minister key, and in Group VII, Certified 
Public Accountant. 

None of the specific SVIB indices, i.e., pri- 
mary patterns in the Group X (verbal-lin- 
guistic) occupations, Occupational Level (OL) 
scores above 55, or Masculinity-Femininity 
(MF) scores for men below 49, which had 
been found to be related to the academic suc- 
cess criterion, were related to the later pro- 
fessional success criterion, either over the en- 
tire range of success or between the two ex- 
treme groups. 


Leonard D. Goodstein and Barbara A. Kirk


DISCUSSION


The present study indicates once again that
a selection testing program, even one carefully
designed and logically selected, must be evaluated
not only against the empirical criterion
of training success but also against the em- 
pirical criterion of later on-the-job success. 
Those graduate students in public health edu- 
cation who come with the most prior related 
academic and work experience (those with 
tested motivation and aptitude in the field) 
and who appear most personally stable are the 
most successful, both as students and as later 
professional public health workers. While the 
measured level of achievement in public health 
(APHA examination) prior to training is a 
predictor of academic success, it is not a pre- 
dictor of professional success. Perhaps the 
formal training program as well as the subse- 
quent on-the-job experiences diminishes the 
importance of this pretraining knowledge. 

It is quite interesting to note that, in this 
study, academic success is the best predictor 
of professional success. However, since most 
professional placements, especially those made 
relatively early in professional careers, are 
largely dependent upon faculty recommenda- 
tions, this may not be a surprising finding. 
More data on the relationship of these two 
kinds of evaluation in this and in other pro- 
fessional fields are obviously necessary. 

A comparison of the professional work ex- 
perience of those six Ss ranked as the most 
successful by the judges with that of the six 
ranked as least successful reveals quite different
patterns of job activities. Those judged
most successful showed a gradual increase in 
responsibility through the 6 years, evidenced 
either by assuming a supervisory title or a po- 
sition of broader scope, e.g., one involving a 
national rather than state program. The top 
six Ss had held a total of 16 different profes- 
sional positions during the 6-year period while 
the bottom six held only eight different po- 
sitions. Three of the top six were currently 
students working toward a doctorate, one was 
a college professor, and one was involved in a 
national program, while one of the bottom six 
was out of the field, none was involved in ad- 
ditional academic training, and none was op- 
erating with national programs. 

It is believed that the increase in stress ac- 
companying these positions of greater respon- 
sibility is the crucial factor in the greater mal- 
adjustment evidenced in the retest MMPIs of 
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the top six Ss. These six individuals were gen- 
erally more tense and anxious, as well as more 
active, than they were as students. Contrari- 
wise, the better “adjustment” evidenced by 
the bottom six Ss may be seen as a conse- 
quence of their finding a level of occupational 
activity involving less stress and then working 
successfully at that level. 

The results with the SVIB indicate that 
public health education involves a primary 
interest in welfare occupations with a strong 
secondary interest in verbal-linguistic and 
business-contact occupations. Strong scientific 
and technical interests are apparently not 
necessary for success in this field despite the 
a priori assumptions to the contrary. The high 
weight implicitly given to scientific and tech- 
nical interests in the ranking of the SVIB pro- 
files, together with a failure to give sufficient 
weight to the verbal-linguistic and business- 
contact interests, is undoubtedly responsible 
for the negative relationships between the 
SVIB ranking and the success criteria. While 
these interest patterns are clearly understand- 
able in terms of a post facto analysis of the 
career patterns, they had not been anticipated. 


SUMMARY 


The purpose of the present study was to in- 
vestigate the relationship of certain test in- 
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dices of academic success in a graduate cur- 
riculum, previously reported by Barthol and 
Kirk (1956), and later, on-the-job success 
as a public health educator. Six years after 
graduation, those students who had the most 
background in public health prior to entering 
the program and who were rated as having 
the best adjusted MMPI profiles were rated 
as the most successful by seven counseling 
psychologists who had had some experience 
with the training program. The prior, inde- 
pendent ratings of academic success in the 
program by three faculty judges were, how- 
ever, the best predictors of professional suc- 
cess. Certain changes in both the MMPI and 
SVIB profiles, which were obtained both at 
the time of entry into training and again 6 
years after graduation, were compared to the 
criterion of success and were discussed in 
light of certain patterns in the career de-
velopment of these 19 public health educators.


SELF-DISCLOSURE SCORES AND GRADES
IN NURSING COLLEGE


SIDNEY M. JOURARD

University of Florida


There are educational programs in which
the interpersonal as well as the intellectual
aspects of students’ performances are considered
in the final determination of grades. Collegiate
programs of nursing are a case in point;
students of nursing are judged partly
on the basis of their ability to learn the verbal
subject matter of the curriculum, but also
on the degree to which they have mastered
manual and interpersonal skills. The present
study was undertaken to determine whether
a measure of one aspect of nursing students’
interpersonal behavior (their self-disclosure to
parents and peers), when obtained early in the
students’ careers, would predict their grade-point
averages at the conclusion of their programs.


METHOD


Measurement of Self-Disclosure. Self-disclosure refers
to the act of revealing personal information
about the self. A questionnaire method for assessing
the extent to which subjects have revealed various
categories of information to selected “target-persons”
had previously been found to have both reliability
and at least concurrent validity (cf. Jourard &
Lasakow, 1958). Two additional studies with a much
abbreviated questionnaire (Jourard, 1959; Jourard &
Landsman, 1960) showed that the instrument had
some measure of predictive validity. For the present
investigation, a questionnaire of intermediate length,
listing 25 items of personal information (see Table 1),
was prepared for administration according to the
following instructions:


Indicate on the special answer-sheet the extent to
which certain other people know the information
listed on this questionnaire through your telling it,
or confiding it to them. If you are certain that the
other person knows this information fully, so that
he or she could tell someone else about this aspect
of you, write the number 1 in the appropriate
space. If the other person does not know this
information fully, having only a vague idea, or
incomplete knowledge, write in a zero. Remember do
not write in a 1 unless you are sure that you have
given this information to the other person in full
enough detail that they could describe you
accurately in this respect to another person.

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The answer sheet was ruled with 25 rows, and 4 
columns headed, respectively, Mother, Father, closest 
Male Friend, and closest Female Friend. 

Previous unpublished study had shown that the 
odd-even reliability coefficients for the “target per- 
son” subtotals were in the .80s, and the odd-even 
reliability coefficient for the entire questionnaire was
.93.

Subjects and Procedure. The self-disclosure ques- 
tionnaire was administered to 46 sophomore students 
of the University of Florida College of Nursing dur- 
ing a regular classroom session. Median age of this 
group was 20 years. 

By the time this group had become seniors, attri- 
tion had reduced the N to 23. Following the comple- 
tion of the senior academic year, grade-point aver- 
ages of these 23 subjects were calculated for (a) all 
nursing courses taken during the 4 years of study, 
(b) nursing courses taken in the junior and senior
years, (c) all nonnursing courses taken during the 
4-year program, and (d) all courses combined. Prod- 
uct-moment correlations were calculated between 
these grade-point averages and the self-disclosure 
scores obtained 2 years earlier. 
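
The product-moment correlation named above can be sketched in a few lines. This is an illustrative sketch only: the function names `pearson_r` and `t_for_r` and the six score pairs are hypothetical and are not data from the study. The t statistic shown is the conventional test of r against zero with n - 2 degrees of freedom; the source does not say which significance test was actually applied.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two equal-length sequences with at least 3 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_for_r(r, n):
    """t statistic (df = n - 2) for testing a correlation against zero."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Hypothetical scores, invented for illustration only.
disclosure = [48, 55, 61, 67, 72, 80]
gpa = [2.4, 2.9, 2.7, 3.1, 3.0, 3.6]

r = pearson_r(disclosure, gpa)
print(round(r, 3), round(t_for_r(r, len(gpa)), 2))
```

With the N of 23 reported here, the same t would be referred to 21 degrees of freedom.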


RESULTS 


Table 2 shows the correlations between the 
various grade-point averages and the self-dis- 
closure scores. Significant r’s were found be- 
tween total disclosure scores (see Column 5) 
and the three sets of grade-point averages in 
which nursing courses figured. The correlation 
between total self-disclosure score and grades 
for nonnursing courses did not reach statisti- 
cal significance. Table 2 also shows correla- 
tions between each of the target subtotals 
and the various grade-point averages. It may 
be seen that the highest r’s were found be- 
tween scores for disclosure to Mother and 
the three grade-point averages which in- 
cluded nursing courses. The r’s between dis- 
closure to Female Friend and nursing grades 
were slightly lower, and those between dis- 
closure to Father and the two sets of nursing 
grades were lower still, but still within the 
range of statistical significance. The r’s between
disclosure to Male Friend and grades
were all insignificant, as were those between
the various target subtotals and grade-point
average for nonnursing courses.


TABLE 1
SELF-DISCLOSURE QUESTIONNAIRE

1. What you like to do most in your spare time at home, e.g., read, sports, go out, etc.
2. The kind of party or social gathering that you [enjoy] most
3. Your usual and favorite spare-time reading material, e.g., novels, nonfiction, science fiction, poetry, etc.
4. The kinds of music that you enjoy listening to most, e.g., popular, classical, folk-music, opera
5. The sports you engage in most, if any, e.g., golf, swimming, tennis, baseball, etc.
6. Whether or not you know and play any card games, e.g., bridge, poker, gin rummy, etc.
7. Whether or not you will drink alcoholic beverages; if so, your favorite drinks: beer, wine, gin, brandy, whisky, etc.
8. The foods you like best, and the ways you like food prepared; e.g., rare steak, etc.
9. Whether or not you belong to any church; if so, which one, and the usual frequency of attending
10. Whether or not you belong to any club or fraternity, civic organizations; if so, the names of these organizations
11. Any skills you have mastered, e.g., arts and crafts, painting, sculpture, woodworking, auto repair, knitting, weaving, etc.
12. Whether or not you have any favorite spectator sports; if so, what these are, e.g., boxing, wrestling, football, basketball, etc.
13. The places that you have traveled to, or lived in during your life: other countries, cities, states
14. What your political sentiments are at present; your views on state and federal government policies of interest to you
15. Whether or not you have been seriously in love during your life before this year; if so, with whom, what the details were, and the outcome
16. The names of the people in your life whose care and happiness you feel in some way directly responsible for
17. The personal deficiencies that you would most like to improve, or that you are struggling to do something about at present, e.g., appearance, lack of knowledge, loneliness, temper,
18. Whether or not you presently owe money; how much, and to whom
19. The kind of future you are aiming toward, working for, planning for, both personally and vocationally, e.g., marriage and family, professional status, etc.
20. Whether or not you are now involved in any projects that you would not want to interrupt at present, either socially, personally, or in your work; what these projects are
21. The details of your sex life up to the present time, including whether or not you have had, or are now having sexual relations, whether or not you masturbate, etc.
22. Your problems and worries about your personality, that is, what you dislike most about yourself, any guilt, inferiority feelings, etc.
23. How you feel about the appearance of your body (your looks, figure, weight): what you dislike and what you accept in your appearance, and how you wish you might change your looks to improve them
24. Your thoughts about your health, including any problems, worries, or concerns that you might have at present
25. An exact idea of your regular income (if a student, of your usual combined allowance and earnings, if any)


TABLE 2
CORRELATIONS BETWEEN SELF-DISCLOSURE SCORES AND GRADE-POINT AVERAGES IN NURSING COLLEGE

Columns (disclosure scores): Mother, Father, Male Friend, Female Friend, Total Disclosure
Rows (grade-point averages): All nursing courses; Junior and senior nursing courses; All nonnursing courses; All courses combined

[The cell entries are not legible in this copy. Only three values survive (29, 38, 59**), and their positions in the table cannot be determined. The table notes marking the .05 level and a second, stricter significance level are likewise illegible.]


DISCUSSION


The self-disclosure scores may be presumed 
to reflect the degree to which the subjects 
actually have engaged in self-revealing com- 
munication to significant others in their lives. 
The present findings strongly suggest that 
this type of activity prepares a nursing stu- 
dent to engage in the kinds of behavior which 
will earn her the highest grades in nursing 
college. It is interesting that the highest cor- 
relations were found between scores for dis- 
closure to Mother and grades in nursing 
courses. This finding implies that experience 
at communicating openly with one’s mother is 
good preparatory practice for communication 
with other female authority figures, viz., the 
faculty of the college of nursing. Support for 
this interpretation is provided by the fact that 
in the particular college from which the sub- 
jects were drawn, the students were required 
throughout the program to reveal their per- 
sonal reactions to books and articles they 
had read, and patients they had dealt with, 
through the media of classroom discussion and 
written “reaction reports.” It is likely that 
those students who were the most “open” in
such communication impressed the faculty 
most favorably, and hence earned the higher 
grades. 

Another factor which the faculty considered 
in assigning course grades to students was the 
observed facility with which the students in- 
teracted with patients. An exploratory study 
showed that 17 sophomore and junior stu- 
dents who received high ratings from their 
clinical instructors on “ability to establish 
close, communicating relationships with pa- 
tients” had higher total disclosure scores (p 
< .05) than 17 matched students who were 
rated poor on this ability. The self-disclosure 
questionnaires had been administered a year 
prior to the time of rating. This finding too 
suggests that the higher-disclosing students 
were best able to elicit disclosure from pa- 
tients, producing thereby a favorable impres- 
sion upon their instructors. 

That the self-disclosure scores are not indices
of intelligence, or of more general academic
aptitude, is attested by an insignificant
correlation of .07 that was found in a sample
of 52 freshmen nursing students between total 
score on the ACE Psychological Examination 
and total self-disclosure scores. Moreover, the 
correlation in the present study between self- 
disclosure scores and grades in nonnursing 
courses was not significant, suggesting that 
the attributes measured by the self-disclosure 
questionnaire played a lesser role in perform- 
ance in more strictly academic courses. 

The question may be raised whether the 
self-disclosure questionnaires employed here 
could have predicted which subjects would 
fail in the program, or would leave it for 
other reasons. A comparison was made be- 
tween the self-disclosure scores of 34 students 
who dropped from the nursing program prior 
to their senior year and those available from 
37 juniors and seniors tested at the same time. 
Mean total disclosure score of the dropout 
group was 59.60, SD 15.79, while that for the 
continuing group was 62.94, SD 15.86. The 
difference between means was not statistically 
significant. Inspection of the scores of those 
students who failed did not show any con- 
sistent trend toward higher or lower mean dis- 
closure scores than those found in students 
who left school for reasons of marriage, or 
who changed courses; nor did the self-dis- 
closure scores of the failing students differ 
significantly from the mean total of the senior 
class. This observation is not intended to be 
conclusive, however; it is possible that study 
of a larger sample of failing students might 
point up some trends that were not apparent 
here. 
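
The comparison of dropout and continuing groups can be checked from the summary figures reported above. The sketch below assumes a pooled-variance t test for two independent groups and treats the reported SDs as sample standard deviations; the source names neither the test nor the SD formula, so both are assumptions, and `pooled_t` is a hypothetical helper name.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance t for two independent groups, from summary statistics.

    Returns the t statistic and its degrees of freedom (n1 + n2 - 2).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Continuing group (37 juniors and seniors) versus dropout group (34 students),
# using the means and SDs given in the text.
t, df = pooled_t(62.94, 15.86, 37, 59.60, 15.79, 34)
print(f"t({df}) = {t:.2f}")  # prints "t(69) = 0.89"
```

Under these assumptions t is under 1, far short of the value of about 2.0 needed at the .05 level with 69 degrees of freedom, which agrees with the statement that the difference between means was not significant.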

Nursing is not the only profession in which 
the ability to establish close relationships with 
others is a desired trait; counseling, psy- 
chotherapy, teaching, military and industrial 
leadership all require some measure of inter- 
personal competence. The present findings 
provide further evidence for the predictive 
validity of self-disclosure questionnaires, and 
suggest that they may have promise of pro- 
viding a measure of one of the important non- 
intellective factors which might predict suc- 
cess in programs of training for these voca- 
tions. 

SUMMARY 


A self-disclosure questionnaire was administered
to a group of students of nursing during
their sophomore year. At the conclusion
of their senior years, grade-point averages
were calculated for (a) all nursing courses
taken during the 4 years of study, (b) nursing
courses taken in the junior and senior
years, (c) all nonnursing courses taken during
the 4-year program, and (d) all courses
combined. Significant correlations were found
between the scores for disclosure to Mother,
Female Friend, and Total Disclosure, on the
one hand, and all grade-point averages in
which nursing courses were included, on the
other. Disclosure to Father was significantly
correlated with grades for all nursing courses,
and grades in nursing courses taken in the junior and
senior years. Disclosure to Male Friend was
not significantly correlated with any of the
grade-point averages.


EFFECT OF COLORED ILLUMINATION UPON
PERCEIVED TEMPERATURE ¹


PAUL C. BERRY

Psychological Research Associates, Matrix Corporation


There is an almost universal tendency to
speak of green or blue as “cool” colors, and
of red or orange as “warm” (von Allesch,
1925). When colored lights are used to illuminate
stage scenes, the “cool” hues give the audience
the impression of a low temperature on
stage, while the illusion of heat is produced
by “warm” lighting (Ross, 1938).

Can a person’s judgment of the temperature
of the air around him be biased by the
hue of his surroundings? Can this conventional
association between colors and temperature
be used to improve the comfort of individuals?

Despite anecdotal evidence that this is possible,
the only experimental study with direct
bearing on this question was conducted by
Morgensen and English (1926). They asked
subjects to judge the temperature of heating
coils wrapped in paper of different colors.
Subjects apparently judged the green ones as
hottest.


METHOD


Because the use of the terms “warm” or “cool” for
certain colors is so widespread, it seemed advisable
to conduct the experiment so that subjects were
unaware of the experimenter’s interest in their
judgments of comfort and temperature. Subjects were
therefore instructed in a simple tracking task, and
were led to believe that the experiment concerned
the effects of colored illumination upon tracking
performance. They were told that the special lights
generated considerable heat, and were therefore asked
to indicate by a switch when the temperature rose
to a point at which they began to feel uncomfortably
warm. They were led to believe that the experimenter
needed this information to avoid having discomfort
interfere with their performance on the
tracking task. Shortly after they had turned on the
signal switch, they were taken out of the experimental
room for a rest while the room was cooled
by exhausting the air from it. In fact, the light
sources produced negligible heat, and the heating was
caused by electrical blower-heaters concealed in the
room, thus providing uniform heating for all colors.

1 This work was conducted under contract with
the Ford Motor Company, Purchase Order No. EP
104076-W, and is published by permission.


Five colors of light were used. Two “cool” colors, 
green and blue, were balanced with two “warm” 
colors, yellow and amber, and white light also was 
used. Each subject was given five tests, one with 
each color, with the orders of presentation counter- 
balanced so that at the completion of testing for 25 
subjects, each color had appeared at each position
of the sequence on five occasions. 
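
One way to obtain the balance described above is to cycle a 5 × 5 Latin square five times across the 25 subjects. The sketch below uses a simple cyclic square; this is an assumption, since the source does not say which square or assignment scheme was actually used, and the function names are hypothetical.

```python
from collections import Counter

COLORS = ["white", "yellow", "amber", "green", "blue"]


def cyclic_latin_square(items):
    """Return an n x n Latin square: row i is the item list rotated by i."""
    n = len(items)
    return [[items[(row + pos) % n] for pos in range(n)] for row in range(n)]


def assign_orders(n_subjects, items):
    """Give each subject one row of the square, cycling through the rows."""
    square = cyclic_latin_square(items)
    return [square[subject % len(items)] for subject in range(n_subjects)]


orders = assign_orders(25, COLORS)

# Count how often each color falls at each position in the sequence.
counts = Counter(
    (position, color)
    for order in orders
    for position, color in enumerate(order)
)
assert all(counts[(pos, color)] == 5 for pos in range(5) for color in COLORS)
print(orders[0], orders[1])
```

A cyclic square balances position only; it does not balance which color immediately precedes which, so a design concerned with carry-over effects would need a different square.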

Following these five tests (in which the subject 
was unaware of our interest in temperature judg- 
ments) the subject was shown samples of the five 
colors that had been used and was asked to rank 
them in order of the amount of heat they had trans- 
mitted. 

Experimental Room. The tests were conducted in 
a specially constructed room 4’ × 10’, painted white.
The ceiling of this room sloped from 8’ high at one 
end, where the subject was seated, to 5’ high at the 
far end, and was entirely covered by fluorescent 
lamps shining through removable filters and fluted 
glass diffusers. The tilted ceiling was used in order to 
direct the light directly toward the subject. 

The experimental room was located within the 
PRA building, which is supplied with air condition- 
ing maintaining the general ambient temperature of 
72°F and 50% relative humidity. A large vented ex- 
haust fan in the experimental room permitted re- 
moving the hot air generated during the tests, and 
replacing it rapidly with the cool air from the sur- 
rounding building. Heating of the room was pro- 
vided by six 1,000-watt blower-heaters concealed in 
the room, controlled to produce a rise of 2°F per 
minute in the room temperature. 

Measurements of temperature and humidity in the 
room were provided by wet- and dry-bulb remote- 
reading thermometers. 

Irrelevant Task. A task was required that would 
appear plausible for testing with the different colored 
lights, that would require the subject to look at the 
lighting, and that would be sufficiently interesting to 
prevent boredom during the testing. For this purpose,
an American Automobile Association auto-trainer,
Model 3539, was used.²

In this auto-trainer, a standard set of automobile 
controls is used to direct the motion of a model car 
on an endless-belt roadway mounted in front of the 
controls and sloping upward away from the subject. 
This apparatus occupied the full length of the room. 
The low point of the sloping ceiling was immediately 
above the far end of the roadway, so that in looking
down the road, the subject was forced to include
a considerable area of the lighting diffuser in his field
of vision.

2 The auto-trainer was kindly supplied by special
arrangement with the manufacturer, Allgaier Shops,
Inc., Arlington, Virginia.

Subjects were told that the experiment, which was
being conducted for the Ford Motor Company, concerned
the effect of colored lighting on “some of the
skills related to driving.” They were given some initial
practice with the auto-trainer, and then were
told that during the test runs an automatic scoring
device would record all the occasions in which the
model car ran off the roadway. They were told that
their score would be calculated in errors-per-mile, so
that they should drive slowly and carefully. In fact,
no scoring procedure for the driving task existed.

Lighting Used. The entire area of the ceiling was
covered with 18 daylight white (Champion F40-D
preheat, 40 watts) and 18 blue fluorescent tubes
(Sylvania F40-T12/B/RS preheat, 40 watts). The
white tubes were used alone, or with the yellow,
amber, and green filters, while the blue lamps were
used alone to provide the blue source. Beneath the
tubes were mounted racks for the inclusion of theatrical
gels to provide the color during the yellow,
amber, and green sessions, and frosted glass filters to
provide an evenly lighted expanse across the entire
ceiling area.

The gelatins were selected on the basis of appropriateness
of color and approximately equal visual
brightness when in place. Visual brightness was
equated by the use of a Photronic Cell equipped with
a Viscor filter ³ which gives a response to different
hues which closely matches the standard luminosity
curve for eye sensitivity adopted by the International
Commission on Illumination. The sources and
filters for the five colors were as follows: white,
unfiltered daylight fluorescent; blue, unfiltered blue
fluorescent, dominant frequency 478 mμ; green,
daylight fluorescent with Roscolene gel 9-43, dominant
frequency 555 mμ; yellow, daylight fluorescent with
Roscolene gel 9-6, dominant frequency 578 mμ; amber,
daylight fluorescent with Roscolene gel 9-11,
dominant frequency 588 mμ.

The intensity of the four colored lights was approximately
equal. Readings taken with a MacBeth
illuminometer on the white surface of the roadway
directly under the light diffusers indicated about 280
apparent foot-candles for the colored lights. The unfiltered
white light taken at the same position produced
a reading of 420 apparent foot-candles.

Duration of Testing. Subjects were told to expect
about 1½ hours of testing. The duration of the test
for a single color ranged from 4 to 29 minutes. Between
the separate colors, the subject was returned
to the outer room (lighted by warm-white fluorescent
tubes) for 10 minutes.

3 This was kindly supplied by special arrangement
with the Washington, D. C. office of the Weston
Division of Daystrom, Inc.


TABLE 1
ANALYSIS OF VARIANCE ON TEMPERATURE-HUMIDITY
INDEX FOR THRESHOLD OF DISCOMFORT

Source                                     SS      df
Color                                    17.36       4
Order                                    65.90       4
Subjects                               2604.34      24
Residual (interactions confounded)      394.40      92
Total                                  3082.00     124


Scoring Procedure. For each subject on each color,
the temperature at the moment when the subject indicated
the onset of discomfort was recorded from
both the wet- and the dry-bulb thermometers. These
readings were combined to give the temperature-humidity
index (once called the “Discomfort Index”)
used by the United States Weather Bureau to evaluate
the joint effects of temperature and humidity.
This index is calculated as

I = 0.4(D + W) + 15

where D is the dry-bulb temperature and W is the
wet-bulb temperature.
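
The index is simple enough to state as a one-line function. The sketch below assumes the coefficient is 0.4, as in the Weather Bureau's published form of the index and as the mean scores near 81 reported in the Results imply; temperatures are in degrees Fahrenheit. The function name and the example readings are hypothetical.

```python
def temperature_humidity_index(dry_bulb_f, wet_bulb_f):
    """Temperature-humidity index: I = 0.4 * (D + W) + 15, with D and W in deg F."""
    return 0.4 * (dry_bulb_f + wet_bulb_f) + 15.0


# Hypothetical readings, chosen only to show the scale of the index.
for dry, wet in [(72.0, 60.0), (90.0, 75.0), (95.0, 78.0)]:
    print(dry, wet, round(temperature_humidity_index(dry, wet), 1))
```

Because the two readings enter with equal weight, a one-degree rise in either thermometer raises the index by 0.4 of a point.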

At the close of the tests, subjects were asked to 
rank the five colors according to the amount of heat 
they transmitted, with Rank 1 assigned to the hot- 
test, and Rank 5 to the coolest. 

Subjects. Subjects were 25 paid volunteers, all
adults, all high school graduates, all able to drive an
automobile, and none color-blind. They were 19 men
and 6 women.


RESULTS 

The mean temperature-humidity index score 
reported at the onset of discomfort was calcu- 
lated both by colors and by order of presenta- 
tion. For the five colors, these scores were: 
white, 80.5; yellow, 81.2; amber, 81.6; green, 
81.0; blue, 81.3. As will be seen in the analy- 
sis of variance (Table 1) the effect of color 
is almost exactly equal to that expected by 
chance alone. 

The order of presentation did produce a 
small significant effect, later runs showing 
greater heat tolerance than early ones. The 
mean scores by order were: first presentation, 
80.3; second, 80.7; third, 80.8; fourth, 81.7; 
fifth, 82.3. 
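
The two statements above can be checked against the sums of squares in Table 1. The sketch below forms each F ratio as the effect mean square over the residual mean square. The degrees of freedom for color and order (4 each) are taken as the number of levels minus one, which is an assumption about table entries rather than a quotation from the source, and `f_ratio` is a hypothetical helper name.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F ratio: effect mean square divided by error mean square."""
    return (ss_effect / df_effect) / (ss_error / df_error)


# Sums of squares from Table 1.
SS_COLOR, SS_ORDER, SS_RESIDUAL = 17.36, 65.90, 394.40
DF_COLOR = DF_ORDER = 4  # assumed: 5 colors - 1 and 5 positions - 1
DF_RESIDUAL = 92

f_color = f_ratio(SS_COLOR, DF_COLOR, SS_RESIDUAL, DF_RESIDUAL)
f_order = f_ratio(SS_ORDER, DF_ORDER, SS_RESIDUAL, DF_RESIDUAL)
print(round(f_color, 2))  # 1.01
print(round(f_order, 2))  # 3.84
```

An F near 1.0 for color is what chance alone would be expected to produce, while the F for order is several times larger, in line with the small but significant order effect reported.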

TABLE 2
MEAN RANKINGS OF COLORS ACCORDING
TO “AMOUNT OF HEAT TRANSMITTED”

Color       Rank
White       2.64
Yellow      2.54
Amber       2.50
Green       3.64
Blue        3.68
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The results of the rankings by the subjects 
at the close of the experiment are shown in 
Table 2. This effect was tested for significance 
by the Kendall-Friedman ranks test, and is 
clearly significant (W = .186; p < .01).
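
The reported coefficient of concordance can be converted to the Friedman chi-square statistic through the standard identity chi-square = m(n - 1)W, where m is the number of rankers and n the number of objects ranked. The sketch below applies it to the reported W = .186 with 25 subjects and 5 colors; the hard-coded critical value 13.277 is the tabled chi-square for 4 degrees of freedom at the .01 level, and the function name is hypothetical. Whether the author used this route or a direct table of W is not stated.

```python
def friedman_chi_square(w, n_rankers, n_objects):
    """Friedman chi-square (df = n_objects - 1) from Kendall's W."""
    return n_rankers * (n_objects - 1) * w


CRITICAL_01_DF4 = 13.277  # tabled chi-square, 4 df, .01 level

chi_square = friedman_chi_square(0.186, n_rankers=25, n_objects=5)
print(round(chi_square, 1), chi_square > CRITICAL_01_DF4)
```

The statistic comes out near 18.6, above the .01 critical value, so the agreement among subjects' rankings is unlikely to be a chance result even though W itself is modest.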

It will be seen that the green and blue are 
ranked almost identically, and so are the yel- 
low and amber, while the white is at an inter- 
mediate position quite close to the yellow and 
amber. 

It might be supposed that in making these 
rankings after completion of tests, the subject 
was correctly recalling the temperature at 
which he had signaled the onset of discom- 
fort. However, the within-subjects correlation 
between actual temperature-humidity index 
score at onset of discomfort and subsequent 
ranking is only +.09, not significantly differ- 
ent from zero. 


CONCLUSION 


It may be concluded that: 

1. Subjects did not show any change in the
levels of heat they would tolerate as a function
of the colors of illumination, and

2. Subjects nevertheless persisted in the
conventional belief that green and blue are
“cool” colors when asked to rank the colors
that they had experienced.

It appears, therefore, that if colored illumi- 
nation were used to increase the comfort of 
persons exposed to uncomfortably warm con- 
ditions, the threshold of discomfort and hence 
the frequency of their complaints would prob- 
ably not be altered by the coloring (unless, 
of course, the coloring also produced a real 
change in temperature). On the other hand, 
it seems likely that a false belief in the effi- 
cacy of blue or green filters would be wide- 
spread. 
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NEED FOR ACHIEVEMENT AND RISK PREFERENCES 
AS THEY RELATE TO ATTITUDES TOWARD REWARD 
SYSTEMS AND PERFORMANCE APPRAISAL 
IN AN INDUSTRIAL SETTING 


HERBERT H. MEYER AND WILLIAM B. WALKER ¹


General Electric Company, New York City 


This report covers the second phase of an 
exploratory study to investigate the possibility 
of improving predictions of certain attitudes 
and behaviors of individuals in an industrial 
setting by the use of measures designed to 
assess achievement motivation. In the first 
phase of this study, reported earlier (Meyer, 
Walker, & Litwin, 1961), it was found that 
managers in jobs with definite entrepreneurial 
characteristics scored significantly higher than 
specialists of comparable age, education, and 
job level, whose jobs were judged to be non- 
entrepreneurial in nature, on a thematic ap- 
perceptive measure of need for achievement (n 
Achievement). The managers also differed significantly
from the specialists in showing preference
for intermediate level risks (near 50-50),
a behavior characteristic which Atkinson
and others (Atkinson, 1957, 1958; Atkinson 
& Litwin, 1960; McClelland, 1958) have dem- 
onstrated is indicative of high achievement 
motivation or low fear of failure motivation. 

The second phase of the study, reported 
here, dealt with relationships between meas- 
ures of achievement motivation and reactions 
to two different types of salary plans and the 
General Electric Company performance ap- 
praisal program. Based on the implications 
of past research studies, most of which were 
conducted in academic settings, it was hy- 
pothesized that persons who score high on n 
Achievement, as contrasted to persons scoring 
low on this measure, would: 

1. Prefer a salary plan based on the merit
pay or “pay for performance” philosophy,
and would have a less favorable attitude toward
a “scheduled increases” type of salary
plan where pay is based more on age and
length of company experience.

2. Express more favorable attitudes toward
the performance appraisal program which
involves the periodic evaluation of results
achieved on the job and the feedback of this
information to the individual.

3. Be more likely to take action to improve
performance on the basis of the performance
appraisal feedback.

1 The authors wish to express their appreciation to
George H. Litwin of Harvard University, who collaborated
on the first phase of this study reported
earlier (Meyer, Walker, & Litwin, 1961), for scoring
the predictor measures and for his advisory assistance
on this phase of the study.

Since past research has shown that the
tendency to prefer intermediate risks in a
risk-taking situation is correlated with other
indexes of high achievement motivation and
with low anxiety or “fear of failure” motivation,
it was also expected that a measure of risk
preferences would show significant correlations
with the reactions listed above.

Some of the evidence on which the above 
hypotheses are based is summarized by Mc- 
Clelland (1961). He cites research evidence 
to show that the person high in achievement
motivation prefers a pay-for-performance re- 
ward system because it provides an objective 
means for indicating one’s level of compe- 
tence. In other words, the monetary reward 
serves as a symbol of achievement. 

As additional evidence for the hypotheses, 
Litwin (1958) found that when individuals 
were asked to set the amount of reward which 
should be granted for different levels of 
achievement in a game of skill, the gradient 
of the reward curve set by high scorers on 
n Achievement was significantly steeper than 
that set by low scorers on this measure. That 
is, persons high in achievement motivation 
felt that the rate of pay for accomplishing 
more difficult tasks should increase more 
rapidly than did the subjects who scored low. 

Herbert H. Meyer and William B. Walker

Evidence which would lead to the third hy-
pothesis, which deals with the reaction to per-
formance appraisal feedback, is provided in a 
study by French (1958). In this study it was 
found that task-relevant feedback information 
given to subjects working on problems was 
significantly more effective in improving per- 
formance for high need achievers than it was 
for those low in n Achievement. In another 
experiment by French (1955) it was also 
shown that the degree to which improvement 
was shown in performance after feedback of 
results information to subjects working on a 
coding test was correlated with a measure of 
achievement motivation. 


METHOD 


The same subjects and predictor variables used in 
the first phase of the study reported previously 
(Meyer et al., 1961) were used in this second phase 
of the study. Thirty-one managers in manufacturing 
components and 31 specialists in staff-type jobs, 
matched with the managers for age, education, po- 
sition level, and length of service, completed a short 
Risk Preference Questionnaire and wrote brief stories 
to six thematic apperception pictures scored for n 
Achievement, n Power, and n Affiliation (see Atkin- 
son, 1957, Recommended Multiple Purpose Set A, 
Appendix III). 

Each subject was also interviewed to obtain an 
indication of his attitudes toward and reactions to 
salary plan variations and the performance appraisal 
program. Specifically, the interviewers rated the de- 
gree to which each participant’s attitudes were favor- 
able or unfavorable toward: 

1. A Merit Pay Plan: the type of salary plan pres- 
ently used by the General Electric Company for pro- 
fessional or managerial personnel in which pay level, 
within a broad range established for the position, is 
based on performance—i.e., the manager appraises 
performance and establishes rewards in direct pro- 
portion to perceived excellence of accomplishment. 

2. A Scheduled Increases Plan: the type of plan 
typically used for lower level salaried positions, 
where increases are automatic on a scheduled basis. 
Outstanding performance can generally be rewarded 
only by promotion. 

3. The Performance Appraisal Program: the peri- 
odic feedback of the boss’s appraisal of performance 
results to the individual. 

The interviewee was also asked to describe in some 
detail his experience with the performance appraisal 
program and, especially, his reaction to the last ap- 
praisal discussion which his manager had held with 
him. The interviewer probed specifically for any in- 
dication that the man had taken some action to im- 
prove his performance based on the performance ap- 
praisal discussion. The man’s response was coded 
positively if he could cite some specific action he 
took based on items discussed or suggestions made
during the interview. He might have indicated, for
example, that he had enrolled in a Human Relations 
training course at the suggestion of his supervisor, or 
that he had reorganized his method for keeping scrap 
records so that he could get an earlier indication of 
needed corrections, or that he had made a special 
effort to get reports in on time since this was men- 
tioned by the manager as an item which needed im- 
provement. 

During the interview the subject was also pre- 
sented with a list of factors which might be consid- 
ered in determining a man’s pay, such as “Age,” 
“Length of Experience,” “Impact of individual’s con- 
tribution on the success of the component,” and 
“Status or level of individuals who must be con- 
tacted (in or out of the company).” He was asked 
to rank these in order of the importance that he felt 
should be given them in determining a man’s pay. 
It was predicted that persons high in achievement 
motivation would rank Impact of Contribution high, 
and Age and Length of Experience low in impor- 
tance. It was also hypothesized that persons scoring 
high in n Power would rank Status of Contacts high 
in importance as a factor determining pay. 

These interview variables, dealing with reactions to 
reward systems and performance appraisal, were cor- 
related with motive measures and other variables in 
the analysis for this phase of the study. 


RESULTS 


Table 1 presents correlations between the 
different predictor and control measures and 
the participants’ attitudes toward salary plan 
variables.²

Considering first the motive scores, it can 
be seen that n Achievement showed no signifi- 
cant relationships to any of the attitudinal 
variables. Need for Power, on the other hand, 
was found to be positively related to rankings 
of the factor Status of Contacts for the Man- 
agers and for the total group, as hypothesized, 
and shows some correlations which approach 
significance in the direction hypothesized for 
n Achievement. 


2 For this analysis the distributions of ratings of
salary plan alternatives and rankings of factors de-
termining pay were dichotomized as near as possible
to the median in each case. Therefore, all correla-
tions are biserials except for those with the Risk
Preference Questionnaire, which are tetrachorics. In 
computing significance levels of these coefficients, 
one-tailed tests were used in those cases where the 
direction of the relationship was clearly predicted. 

For the Risk Preference Questionnaire a high score 
indicates a preference for intermediate level risks. 

Age and Length of Experience as factors determin- 
ing pay were combined for this analysis, since ranks 
assigned these factors by the participants were highly 
correlated. 
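
The dichotomize-then-correlate procedure described in Footnote 2 can be illustrated with a minimal Python sketch. It assumes the standard textbook formula for the biserial coefficient, r_bis = ((M1 - M0) / s)(pq / y), where y is the normal ordinate at the cut. The function names and the sample ratings and scores are hypothetical and are not data from this study. The tetrachoric coefficients used for the Risk Preference Questionnaire require an iterative estimate and are not shown.

```python
from statistics import NormalDist, fmean, pstdev


def median_split(values):
    """Dichotomize as near the median as possible: 1 = above the cut, 0 = at or below."""
    ordered = sorted(values)
    cut = ordered[(len(ordered) - 1) // 2]  # lower median used as the cut point
    return [1 if v > cut else 0 for v in values]


def biserial(groups, scores):
    """Biserial r between a 0/1 split and a continuous score.

    r_bis = (M1 - M0) / s * (p * q / y), where s is the SD of all scores,
    p and q are the proportions in the two groups, and y is the ordinate
    of the standard normal curve at the point dividing p from q.
    """
    high = [s for g, s in zip(groups, scores) if g == 1]
    low = [s for g, s in zip(groups, scores) if g == 0]
    p = len(high) / len(scores)
    q = 1.0 - p
    normal = NormalDist()
    y = normal.pdf(normal.inv_cdf(q))
    return (fmean(high) - fmean(low)) / pstdev(scores) * (p * q / y)


# Hypothetical interviewer ratings and motive scores (illustration only).
ratings = [1, 2, 2, 3, 3, 4, 4, 5]
motive_scores = [2.0, 1.0, 4.0, 3.0, 5.0, 6.0, 5.0, 8.0]

groups = median_split(ratings)
r_bis = biserial(groups, motive_scores)
```

The biserial value is always larger in absolute size than the ordinary product-moment correlation computed on the same 0/1 split, because it estimates the correlation that would have been obtained had the dichotomized variable been left continuous and normally distributed.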
Preference for intermediate risks, as an in-
dex of achievement motivation, shows several
correlations in the directions hypothesized,
although the tendencies are somewhat incon-
sistent. This measure correlates positively,
for example, with attitudes toward merit pay 
as predicted, in the case of the Managers, but 
the same correlation is only very low for the 
Specialists. Attitudes toward the performance 
appraisal program, on the other hand, are sig- 
nificantly correlated with risk preferences, as 
predicted, for the Specialists, but not corre- 
lated with these attitudes for the Managers. 
Risk preference also correlates in the expected 
directions with rankings, by the Manager 
group, of Age and Experience and Impact of 
Contribution as important in determining pay. 
Low correlations in the opposite direction 
were found for the Specialists. 

Consistent results for the two groups are 
found only for correlations between (a) risk 
preferences and the subjects’ reports of 
whether or not constructive action was taken 
to improve on-the-job performance based on 
their last performance appraisal discussions, 
and (b) this latter variable and age.³ Other
correlations, for example those between age,
education, and attitudinal variables, are in
the expected direction for one of the groups
but are either near zero or in the opposite di-
rection for the other group.⁴


DISCUSSION 


The results of this exploratory study were
inconsistent. Some of the correlations between
the motivation or risk preference measures
and attitudes toward alternative reward sys-
tems and reactions to the performance ap-
praisal program were significant in directions
predicted. Other expected correlations were
not found. It was encouraging to note, how-
ever, that the statistically significant correla-
tions were in the directions predicted.

3 Of the 62 participants in this study, 49 (23 of the
Managers and 26 of the Specialists) had had per-
formance appraisal discussions with their managers
within the last year or two, which they could de-
scribe in some detail. Of the 49, 21 (12 of the Man-
agers and 9 of the Specialists) reported that they had
taken some specific constructive action to improve
performance, based on suggestions made or topics
discussed in the feedback interview.

4 Intercorrelations for the predictor and control
variables (presented in the columns of Table 1) were
given in the report of the first phase of this study
(Meyer et al., 1961). Risk preference, which shows
the greatest number of significant correlations with
dependent variables in Table 1, was not found to be
significantly correlated with any of the other predic-
tor or control variables.

If the occurrence of correlations between 
predictor and criterion variables in directions 
expected is considered as a “backward valida- 
tion” of predictor variables, it would appear 
that the Risk Preference Questionnaire is bet- 
ter than scores on the thematic apperceptive 
measure of n Achievement as an indicator of 
those aspects of achievement motivation on 
which the hypotheses for this study were 
based. All of the hypotheses that were con- 
firmed involved the Risk Preference Ques- 
tionnaire rather than the thematic appercep- 
tion scores. This finding might be explained 
by the possibility that the n Achievement 
score is not a sufficiently reliable measure to 
expect consistent correlations to appear in 
groups as small as those employed in this 
study. It is also possible, of course, that n 
Achievement is a variable which has no va- 
lidity in this situation. 

The fact that the Risk Preference Question- 
naire did correlate with other variables as pre- 
dicted does not, of course, indicate with any 
certainty that it is measuring “achievement 
motivation” of the same type as is appraised 
by thematic apperception. Other causes could 
possibly account for the correlations found. 
It may be, for example, that the type of risk 
preferences shown indicates a degree of nega- 
tive, fear-of-failure motivation, as suggested 
by Meyer, Walker, and Litwin in a report of 
the first phase of this study. 

The critical behavior measured by this 
questionnaire may not be preference for in- 
termediate-level odds in a risk situation, as 
would be expected according to the motiva- 
tion theory on which the study was based. 
While the Risk Preference Questionnaire was 
scored by including preference for either long 
or short odds in one category and preference 
for intermediate odds in the other, a closer 
inspection of the data revealed that few per- 
sons showed preference for long-odds alterna- 
tives. The significant correlations found ap- 
peared to be generated by the subjects who 
chose short odds or the “safer bet” alterna-
tives. In fact, in the correlational analyses, if
the few who expressed preference for long 
odds were included with the subjects who 
preferred intermediate odds, the significant 
correlations found were generally increased 
rather than decreased. Thus, perhaps prefer- 
ence for safe bets is the critical variable, in- 
dicating a need for security or fear-of-failure 
motivation. This may or may not be indica-
tive of low need for achievement. In the re-
lated report mentioned above, evidence was
cited to support the possibility that fear of 
failure is more likely to be associated with 
a moderate level of achievement motivation 
than with low need for achievement. 

The high correlation found between risk 
preference behavior and the subjects’ reports 
of whether or not they took constructive 
action based on the performance appraisal 
discussion provides additional evidence to 
support the hypothesis that this measure is 
assessing some aspect of achievement motiva- 
tion. As was mentioned above, however, it is 
not necessary to interpret this result as indi- 
cating only that the aspect of motivation 
measured is a positive, success oriented type. 
Again, the results can be explained just as 
well by assuming that the critical behavior 
tapped by this measure is the preference for 
safe bets, which may indicate a negative, fear- 
of-failure type of motivation. 

The fact that an individual took action 
based on the appraisal feedback discussion 
must have meant that the manager discussed 
an area of needed improvement. This might 
be interpreted by the subject as a failure on 
his part in some aspect of job performance. 
Atkinson (1957) explains that, according to 
his motivation theory model, the effect of fail- 
ure on the person with a high level of anxiety 
about failing is to decrease his motivation 
and cause him to avoid the situation. He also 
cites research evidence to support this theory. 
This explanation could account for the fact 
that the men who showed preference for the 
safe bets in this study were also found to be 
less likely to take constructive action based 
on the performance appraisal. 

If we ignore fear-of-failure motivation in 
explaining the results, it is possible to make a 
good case for the expectation that persons low
in achievement motivation would take con-
structive action to improve performance based
on appraisal feedback, if the pay-for-perform- 
ance salary plan were working well. According 
to motivation theory and research evidence, 
money for itself is not a primary incentive for 
the person who is high in n Achievement. For 
that person, money is an effective incentive 
only to the extent that it provides an objec- 
tive symbol of success. If other symbols 
equally positive and objectively associated 
with success were available, they should be 
just as effective as incentives for the high 
need achievers. 

For the person low in n Achievement, on 
the other hand, the monetary reward may 
provide the incentive needed to motivate im- 
proved performance on his part. McClelland 
(1961) summarizes research evidence to show 
that money as such is not an effective incen- 
tive for the person high in need for achieve- 
ment, but very effective for the person low in 
this need. In two studies by Atkinson (1958, 
Ch. 19-20), for example, it was found that 
the addition of monetary rewards for success- 
ful completion of tasks improved the perform- 
ance of persons low in n Achievement to the 
extent that they wiped out significant differ- 
ences in performance in favor of the high need 
achievers which had been found when no such 
rewards were offered. 

On the basis of this evidence, one would 
predict that, if the persons low in n Achieve- 
ment expected monetary rewards to be based 
directly on excellence of performance, they 
would be likely to take action to improve per- 
formance. The fact that these persons were 
found to be significantly less likely to take 
such action could be explained very well by 
the rationale presented above if the additional 
assumption were made that the merit pay 
plan was actually not being administered ac- 
cording to the theory on which such a plan is 
based. The interviews provided some evidence 
to show that this may have been true. 


SUMMARY 


This exploratory study was designed to test
the hypotheses that persons high in achieve-
ment motivation would (a) prefer a salary
plan based on a pay-for-performance philoso-
phy, (b) have a favorable attitude toward the
periodic appraisal of performance and feed-
back of this appraisal to the individual, and
(c) be likely to take action to improve per- 
formance based on the feedback of appraisal 
data. The subjects used to test these hypothe- 
ses consisted of 31 Managers and 31 Spe- 
cialists in the manufacturing sections of four 
departments. 

If “need for Achievement” scores on a the- 
matic apperception measure are used as the 
measure of achievement motivation, the re- 
sults of the study would have to be consid- 
ered as negative. However, if risk preference 
behavior, which had been found in previous 
research to have predictive validity as a meas- 
ure of achievement motivation, is used as the 
dependent variable, the results could be in- 
terpreted as largely positive. This measure 
was found to be correlated with reactions to 
salary plan variables mentioned above in 
enough of the relationships explored to indi- 
cate that more definitive studies, employing 
larger groups and an improved measure of 
risk-taking behavior, might provide more defi- 
nite confirmation of the hypotheses consid- 
ered. 


The Administrative Judgment Test (AJT)
was developed after World War II by the 
United States Civil Service Commission. The 
test “attempts to measure broad understand- 
ing of the processes of administration 
whether government or private” (Mandell, 
1950, p. 145). Mandell has demonstrated its 
ability to predict administrative success, as 
evidenced by performance ratings and grade 
level (1950, 1956). This study endeavors to 
explore which intellectual facets of the exer- 
cise of judgment within the decision making 
process are related to over-all performance on 
the AJT. 

The AJT is in multiple-choice form; the 
55-item test No. 600, from the commission’s 
Series No. la, was used in this study. The 
items 
include problems in the relationships between the 
headquarters and field offices in an organization, and 
those between research and operating personnel. They 
also include problems on the timing of programs and 
the organization of the office of an administrator.
The test does not attempt to measure technical
knowledge in such fields as personnel or budgeting
or accounting (Mandell, 1950).


Forty rating scales were used by superiors
and peers to describe decision making ca-
pabilities and styles in the rendering of ad-
ministrative judgments. The final or fortieth
scale was global in nature, asking “In gen-
eral, how effectively do you feel the executive
exercises judgment in his decision-making?”
Cluster analyses of the preceding 39 items
were made separately for superior and peer
ratings in a previous analysis (Forehand &
Guetzkow, in press).

1 This work was supported by a grant from the
Public Affairs Division of the Ford Foundation to the
Center for Programs in Government Administration
of University College of the University
Authorization to use the Administrative Judgment
Test for research purposes O. Glenn
Stahl, Director of the Bureau of Program and Stand-
ards of the United States Civil Service Commission
Guidance and expedition in its use under
conditions were given by Albert Maslow
Mandell in Washington, and Joseph O'Connor,
George Rosenthal, C. S. Littenberg, and Quintin
Guerin in the Chicago regional area.

By correlating the AJT with the items and 
cluster combinations used by superiors and 
peers in describing executive judgments the 
following attempt is made to obtain further 
understanding of the content of this test. 


PROCEDURE 


Subjects. One hundred and twenty-seven persons
holding administrative positions in agencies of the
United States government served as subjects in the
study. The executives represented 7 agencies; their
civil service grade levels ranged from 9 to 17, with
a median of 13.2.

Ratings. The superiors’ ratings were made by each
subject’s immediate organizational superior; the peer
ratings were made by a co-worker, selected by the
superior as one who worked closely with the sub-
ject. The items are described in Table 1; the source
and rationale of the variables are described elsewhere
(Forehand & Guetzkow, in press).

Analysis. Product-moment correlation coefficients
were used to assess the relationship between total
scores on the AJT and individual items, and be-
tween total score and sets of items combined to de-
fine clusters based upon those obtained earlier from
the superior and peer ratings. These latter combina-
tions are called “cluster-combinations” in this paper.

Because the specific ratings of the performance
items shared in an over-all “halo effect” (Forehand
& Guetzkow, in press), adjusted scores were defined
for each specific rating. The adjusted score consisted
of the difference between the rating and the pre-
dicted value of the rating based upon its regression
with the rating of Item 40, “general effectiveness in
exercising judgment.” The correlations between test
scores and the adjusted ratings were determined alge-
braically by computing the part correlations of test
scores with a given rating adjusted for its relation-
ship with the general rating (DuBois, 1957, pp. 60-61).
An adjusted rating was defined for the superior
and peer ratings separately, and for the sum of the
two.
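
The halo adjustment described above can be sketched in Python as follows. The sketch assumes the usual part (semipartial) correlation formula, r = (r_ty - r_tg r_yg) / sqrt(1 - r_yg^2), where t is the test score, y the specific rating, and g the general (Item 40) rating. The function names and the small data set are hypothetical and are not taken from this study.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def adjusted_rating(rating, general):
    """Rating minus its value predicted from the general rating by linear regression."""
    n = len(rating)
    mr, mg = sum(rating) / n, sum(general) / n
    slope = (sum((g - mg) * (r - mr) for g, r in zip(general, rating))
             / sum((g - mg) ** 2 for g in general))
    return [r - (mr + slope * (g - mg)) for r, g in zip(rating, general)]


def part_correlation(r_ty, r_tg, r_yg):
    """Correlation of the test with a rating adjusted for the general rating."""
    return (r_ty - r_tg * r_yg) / sqrt(1.0 - r_yg ** 2)


# Hypothetical scores for eight executives (illustration only).
test = [30, 42, 35, 50, 28, 45, 38, 41]
rating = [3, 5, 4, 6, 2, 5, 4, 3]
general = [4, 5, 4, 6, 3, 4, 5, 3]

r_algebraic = part_correlation(
    pearson(test, rating), pearson(test, general), pearson(rating, general)
)
r_from_residuals = pearson(test, adjusted_rating(rating, general))
```

Computing the coefficient algebraically from the three zero-order correlations gives the same value as correlating the test with the regression residuals directly, so the adjusted scores never have to be computed case by case.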


TABLE 1
[Title only partly legible in the source] ... ADMINISTRATIVE JUDGMENT TEST

Item
No.   Description of Item

 1.   Decides effectively when appropriate precedent is available
 2.   Documents decisions carefully for review by others
 3.   Performs well when bases for decision are clear and definite
 4.   Performs well even when bases for decision are vague and ambiguous
 5.   Performs well in making routine decisions
 6.   Decides well when relevant precedent is lacking
 7.   Assumes responsibility completely when decisions are to be postaudited
 8.   Makes critical or highly important decisions adequately
 9.   Considers all relevant information
10.   Makes simple, straightforward decisions very satisfactorily
11.   Competently makes decisions even when facets of decision must be concealed
12.   Formulates decisions capable of being given adequate public defense
13.   Skilled in considering goals in his judgment making
14.   Works well under heavy decision pressure
15.   Deals effectively with decisions involving staff functions (i.e., budget, personnel)
16.   Deals effectively with decisions involving general administration matters
17.   Effectively obtains group consensus on decisions
18.   Decides well in situations with some policy implications
19.   Competence in “policy implementing” as contrasted with “policy making” decisions
20.   Embodies technical know-how in judgments
21.   Considers implicit, hidden aspects of situation
22.   Refrains from decisions when appropriate
23.   Avoids over-commitment and retains flexibility
24.   Screens facts for relevance and accuracy
25.   Keeps details in perspective
26.   Takes initiative when appropriate
27.   Keeps within scope of given authority
28.   Embodies checks of adequacy in decisions
29.   Makes full use of given authority
30.   Makes judgments which are internally consistent
31.   Decisions contain clues for execution
32.   Feeds-back past results in reshaping objectives
33.   Considers wide range of alternatives
34.   Focuses attention on definition of the problem
35.   Sticks to objective realities, avoiding wishfulness
36.   Works in an orderly, systematic fashion
37.   Times decisions appropriately
38.   Tackles decisions with assurance and self-confidence
39.   Decisions are as appropriate in long-run as in short-run

[The correlation columns of this table (Original Ratings and Adjusted Ratings, each for Superior, Peer, and Combined) are not recoverable from the source.]

* r = .17 significant at .05 level.
** r = .23 significant at .01 level.

The particular items used in the cluster-combina-
tions are indicated in Table 2. Both raw scores and
adjusted scores on these combinations were corre-
lated with the over-all score on the AJT. Scores on
the cluster-combinations and the combined superior
and co-worker ratings are the sums of the original
variables in standard score form. The correlations
involving both sets of scores were computed by
means of the formula for the correlation of sums of
standard scores.
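
As a rough illustration of the correlation of sums of standard scores, the Python sketch below computes the correlation between a test score and a sum of k standardized ratings from the correlation matrix alone: the numerator is the sum of the k test-rating correlations, and the denominator is the square root of the sum of all entries in the k by k matrix of rating intercorrelations. The function names and the data are hypothetical and are not taken from this study.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corr_with_sum(r_x, r_within):
    """Correlation of x with a sum of k variables in standard score form.

    r_x:      correlations of x with each of the k variables
    r_within: k x k correlation matrix of those variables (1s on the diagonal)
    """
    variance_of_sum = sum(sum(row) for row in r_within)
    return sum(r_x) / sqrt(variance_of_sum)


# Hypothetical data: a test score and three ratings forming one cluster.
test = [30, 42, 35, 50, 28, 45, 38, 41]
ratings = [
    [3, 5, 4, 6, 2, 5, 4, 3],
    [4, 5, 3, 6, 3, 4, 5, 4],
    [2, 4, 4, 5, 3, 5, 3, 4],
]

r_x = [pearson(test, r) for r in ratings]
r_within = [[pearson(a, b) for b in ratings] for a in ratings]
r_cluster = corr_with_sum(r_x, r_within)
```

The result agrees with what would be obtained by converting each rating to standard scores, summing them for each person, and correlating that sum with the test.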


RESULTS 


The correlations of AJT performance with 
both the raw and adjusted ratings are pre- 
sented in Table 1. Correlations exceeding .174 
are significant at the 5% level; correlations 
exceeding .228 are significant at the 1% level.
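
The two cutoffs can be checked with a short Python sketch. It assumes a two-tailed t test of a product-moment correlation with N = 127 (so df = N - 2 = 125), which the text does not state explicitly, and uses the relation r = t / sqrt(t^2 + df). The Student t quantile is found numerically with the standard library only; all function names are hypothetical.

```python
from math import exp, lgamma, log, pi, sqrt


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + t * t / df))


def t_cdf(t, df, steps=2000):
    """P(T <= t) for t >= 0, using Simpson's rule on the interval [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def critical_r(n, alpha):
    """Smallest |r| significant at level alpha (two-tailed) for sample size n."""
    df = n - 2
    lo, hi = 0.0, 50.0
    for _ in range(60):  # bisection on the t quantile
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < 1 - alpha / 2:
            lo = mid
        else:
            hi = mid
    t = (lo + hi) / 2
    return t / sqrt(t * t + df)


r_05 = critical_r(127, 0.05)  # should fall close to the .174 quoted in the text
r_01 = critical_r(127, 0.01)  # should fall close to the .228 quoted in the text
```

Under these assumptions the computed values land within rounding distance of the reported cutoffs, which suggests that the significance levels in Tables 1 and 2 are two-tailed.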

The correlation of the global “over-all per- 
formance” rating (Item 40) with AJT is .28 
for superiors and .17 for peers. It will be 
noted that 26 or 67% of the 39 original rat- 
ings by superiors have correlations with the 
AJT which are significantly greater than zero, 
while only 11 or 28% of the raw ratings made 
by peers are significantly related to the AJT. 
Most of these relationships may be accounted 
for by the correlation of the specific ratings 
with the over-all rating of effectiveness. When 
the ratings are adjusted, only six or 15% of
the ratings by superiors and four or 10% of 
the ratings by co-workers are significantly re- 
lated to AJT performance. 

Two adjusted item-ratings from both su-
periors and peers are significantly related to
AJT performance: performance in situations
with policy making implications (Item 18) 
and screening facts for relevance and accu- 
racy (Item 24). In addition, four superior 
ratings and two peer ratings are related to 
test performance. The difference between su- 
periors and peers is interpretable as the dif- 
ference between a hierarchical and collateral 
view of decision making. Superiors see their 
more adequate subordinates, as appraised by 
AJT, as those who operate in terms of the 
demands of the decision making hierarchy, 
making their decisions internally consistent 
(Item 30), working successfully with implicit, 
concealed facets of the problem situation 
(Items 11 and 21), and timing their decisions 
appropriately vis-a-vis the context of their 
situations (Item 37). Peers, on the other 
hand, view their more adequate co-workers, 
as appraised by AJT, as those who implement 
rather than make policy (Item 19) and those 
who avoid decision when it is prudent to do 
so (Item 22). 

By utilizing the results of the previous 
cluster analysis of the ratings (Forehand & 
Guetzkow, in press), complementary but 
somewhat different conclusions emerge. As in- 
dicated above, cluster-combination scores were 
developed for superiors’ and co-workers’ rat- 
ings with the items listed in Table 2. Some of 
the cluster-combinations seem to characterize
the personal and intellectual style of the ex-
ecutive as he makes judgments: Self-Confi-
dence, Cautiousness, Discernment (describing
the tendency to penetrate into the less obvi-
ous aspects of administrative problems), and
Analytic Decision Making Capability. The re-
maining cluster-combinations appear to cen-
ter upon characteristics of the decision-mak-
ing situation: Bureaucratic Decision Making
Capability (characterized by reliance on rules,
precedents and policies), Policy Applying
Abilities (utilizing policies as over-all guides
to decision), and Policy Executing Abilities
(describing skills in the direct implementa-
tion of policy in specific situations).

Table 2 presents the product-moment cor-
relations of these cluster-combination scores
for superiors and peers, both original and ad-
justed. Once halo is removed, the superiors
from their hierarchical perspective see their
high scoring subordinates as cautious, discern-
ing, and analytic. The peers, observing later-
ally, see their high scoring co-workers in terms
of capability in applying agency policy. It is
interesting to note that in neither the origi-
nal nor the adjusted ratings, for either superiors
or peers, is the self-confidence displayed in
judgment making by the executive related to
performance as evaluated by the AJT.

TABLE 2
CORRELATIONS OF CLUSTER-COMBINATION RATINGS OF SUPERIORS AND PEERS WITH TOTAL SCORES ON THE
ADMINISTRATIVE JUDGMENT TEST
(N = 127)

Descriptions of Cluster-Combinations               Original Ratings            Adjusted Ratings
(with item numbers)                             Superior  Peer  Combined    Superior  Peer  Combined

Self-Confidence (Items 26, 29, 38)              [five entries survive, columns uncertain: .15, .18*, -.06, .05, .02 (sign illegible)]
Cautiousness (Items 22, 27, 39)                 [four entries survive, columns uncertain: .30**, .20*, .14, .18*]
Discernment (Items 21, 24, 28)                   .39**    .21*   .37**       .33**    .22    .25**
Analytic Problem-Solving (Items 9, 32-37)       [five entries survive, columns uncertain: .34**, .19*, .19*, .10, .16]
Bureaucratic (Items 1, 3, 7, 10, 11)             .25**    .15    .23*        .07      .06   -.04
Policy Applying (Items 6, 12, 13, 18)            .31**    .25**  .34**       .14      .19*   .19*
Policy Execution (Items 14, 15, 16, 19,          .26**    .19*   .27**       .09      .10    [illegible]
  20, 25, 30)

* r = .17 significant at .05 level.
** r = .23 significant at .01 level.


DISCUSSION


Mandell (1950) compared the AJT with
job performance, as measured by collective
ratings of colleagues and superiors and with
grade level in several samples. He reported
correlations ranging from .50 to .68 with job
performance, and from .28 to .56 with grade
level. He further reported that tests of gen-
eral mental ability had validities “substan-
tially lower, in general, than ... [those of]
the administrative-judgment test,” despite the
fact that the mental abilities test had correla-
tions in the .60s with the AJT. The following
observations are presented by way of com-
paring the results of the present study with
those reported by Mandell.

A “job performance” criterion comparable
to Mandell’s “collective ratings of colleagues
and superiors” was defined as the sum of the
total scores on the 39 specific ratings given by
superiors and peers. Total scores on the AJT
have a correlation of .53 (N = 127, p < .01)
with this criterion and of .32 (N = 186, p
< .01) with grade level. Results obtained by
combining superiors’ and peers’ item ratings
and cluster-combination scores are presented
in Tables 1 and 2, respectively. A test of men-
tal ability, the Thurstone Test of Mental
Alertness (Science Research Associates, 1952),
has a correlation with the composite rating of
.22 (N = 48, p > .05) and a correlation of
.05 with grade level (N = 75, p > .05). The
correlation between the Test of Mental Alert-
ness and the AJT is .58 (N = 80, p < .01).
These results closely parallel those reported 
by Mandell. It should be noted that the sam- 
ple employed in this study was heterogeneous 
with respect to agency, while those studied by 
Mandell were relatively homogeneous. 

Mandell (1950) states that the coverage of 
the AJT is broad and general. The fact that 
so many of the original ratings by superiors 
(67%) correlate with the over-all AJT score 
substantiates this position. The fact that the 
item and cluster-combination scores tend to 
correlate so poorly, once adjusted for halo, 
again substantiates Mandell’s belief that the 
AJT is an over-all test without pockets of 
specificity. The fact that both the halo rating 
and the cluster-combination by the superiors 
were better correlated with AJT than the 
corresponding measures by the peers further 
evidences the hierarchical perspective of the 
AJT. Its strongest relations were with items 
concerned with intellectual processes. 

It should be emphasized that this study is 
concerned with only a partial criterion of the 
AJT’s validity: that of cognitive and intel- 
lectual factors in executive judgment. This 
study should be viewed, therefore, not as an 
attempt to validate again the AJT as an over- 
all instrument, but rather as an analysis of 
some factors which contribute to its general 
validity. This approach is consonant with a 
strategy of building knowledge about execu- 
tive performance by studying segments of the 
total problem (Guetzkow & Forehand, in 
press). A more adequate understanding of the
instrument would entail studies of additional
segments, such as social characteristics in-
volved in human relations, motivational ap-
praisals, and organizational characteristics
broader than those of the immediate decision
situation.

SUMMARY 


Relationships between the Administrative 
Judgment Test of the United States Civil 
Service Commission and ratings by superiors 
and peers of aspects of executive judgment 
were studied in a group of 127 federal ad- 
ministrators. A wide variety of rated charac- 
teristics correlated significantly with the test. 
When the ratings were adjusted to correct for 
the influence of the rater’s general impression 
of effectiveness in making judgments, both 
superiors and peers described executives who 
scored highly on the test as competent in 
making decisions with policy making impli- 
cations and in screening factual information 
for relevance and accuracy. Superiors, in ad- 
dition, described high scoring executives as 
making decisions which are internally consist- 
ent, working successfully with implicit, con- 
cealed facets of the problem situation, and 
timing their decisions appropriately. Peers, on 
the other hand, viewed their more adequate 
co-workers, as appraised by the test, as those 
capable of implementing rather than making 
policy, and those who know when it is prudent 
to avoid decision.

COMPARISON OF SEVERAL STYLES OF TYPOGRAPHY
IN ENGLISH 1


EDMUND B. COLEMAN AND INSUP KIM


Johns Hopkins University 


When reading a sentence, we perceive not 
just a single word at each fixation, but groups 
of three or four words. Similarly when under- 
standing a sentence, we do not understand it 
as a linear string of discrete words: we or- 
ganize the words into meaningful phrases, or- 
ganize these phrases into clauses, and so on. 
Stress, pitch, and juncture help organize the 
words into the correct phrases in speech. Re- 
cently, at least three typographies have been 
proposed that help organize the words cor- 
rectly in reading. 

In several books (1943)
Lillian Lieber
attempted to aid understanding
by printing
only a single phrase
on each line.


Andrews (1949) proposed a second style 
called “square span.” 


In square span    is arranged
the material      in double-line blocks.

North and Jenkins (1951) argued that perhaps
the main advantage of square span lay in its
grouping of words into “thought units.” They pro-
posed a third style called spaced unit in which

the words    are grouped    into thought units    by spaces.


In all three of the above styles there are 
cues that help the reader cluster the words 
into the supposedly correct groups. 

The square span has one other advantage 
over spaced unit (indeed this is the source of 
its rather puzzling name): it utilizes vertical 
eyespan as well as horizontal. But square span 
does not seem to be the arrangement that 
maximally exploits vertical eyespan. It would 
seem to be maximally exploited by a vertical 
arrangement. 

The vertical style below concentrates more
words within the eyespan, even if we assume
the effective span to contract as more relevant
words are concentrated within it.


In the
vertical
style
the
fixations
would
overlap
and
maximally
exploit
peripheral
vision.


1 Part of this investigation was carried out during
the tenure of a National Science Foundation predoc-
toral fellowship, #30125.


So far, previous attempts (Andrews, 1949; 
Klare, Nichols, & Shuford, 1957; Nahinsky, 
1956; North & Jenkins, 1951) to find a 
more efficient style of typography have shown 
conflicting results. North and Jenkins (1951), 
using long passages, found spaced unit su- 
perior in both reading speed and comprehen- 
sion. But Klare et al. (1957), using a similar 
procedure, did not find a significant difference 
between spaced and conventional. 

With oriental languages which traditionally 
use vertical arrangements, Chang (1942) and 
Sato (1958) found little difference between 
the vertical and horizontal arrangements. 
Sato and Kusajima (1958) found that spac- 
ing into “meaningful units” increased read- 
ing speed in Japanese. 

In evaluating these conflicting results, we 
should note that there are many possible ar- 
rangements within each experimental style. 
Perhaps some of the styles were found to be 
inferior to conventional simply because the 
experimenters selected an inefficient arrange- 
ment of the style. In this series of experi- 
ments, we will attempt to select the most effi- 
cient arrangements within each style. 

On the other hand, perhaps some of the 
styles were found inferior because reading
habits were interfering with the new arrange-
ments. In this experiment, the styles will be
tested under two conditions: the traditional
reading situation and a tachistoscope pres-
entation. Because the tachistoscope presenta-
tion is novel, reading habits should be a
relatively less important factor there than in
reading longer selections.

This study will also consider two styles that
have not been previously investigated: verti-
cal and Lieber’s arrangement of a single
phrase to a line.


PROCEDURE 


This paper reports two fairly independent series of 
experiments: in four of them, the subjects read long 
passages of about 1500 words; the other four used 
a tachistoscope presentation.2 Undergraduates from 
Johns Hopkins were used as subjects in both series.
Altogether 267 undergraduates were used in eight in-
dependent experiments.

Reading Series. The subject first read the instruc- 
tions which were typed in the style to be investi- 
gated and thus familiarized himself with the mate- 
rials as well as the procedure of the experiment. For 
warm-up, he read a practice passage of about 200 
words and took a test on it. Then, under a time 
limit, he read several experimental passages, each of 
about 1500 words typed on standard 8½” × 11” pa-
per; and immediately after finishing each passage he
took an objective test on it. He was scored on words 
read per minute and number of questions answered. 

All the reading experiments use a Lindquist (1953, 
p. 289) Type V design. To administer this design in 
reading experiments for three different styles, one se-
lects three passages, prepares each passage in all three
styles, and casts these nine preparations into three 
different graeco-latin squares in which the different 
passages make up the latin factor and order of pres- 
entation is the graeco factor. The subjects are di- 
vided into 3 groups (see Footnote 2), and a subject 
reads one passage in each of the experimental styles. 
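
The arrangement described above can be illustrated with a small sketch. The Python below builds a single 3 × 3 graeco-latin square and turns it into a reading schedule for three subject groups. The function names, the style and passage labels, and the particular square are assumptions made for illustration only; the text says three different squares were used and does not report the actual assignments.

```python
def graeco_latin_square(n=3):
    """Return an n x n grid of (latin, greek) index pairs.

    Combines two Latin squares, (i + j) mod n and (2i + j) mod n,
    which are orthogonal for odd n such as the n = 3 used here.
    """
    if n % 2 == 0:
        raise ValueError("this simple construction needs an odd n")
    return [[((i + j) % n, (2 * i + j) % n) for j in range(n)] for i in range(n)]


def reading_schedule(styles, passages):
    """Map each subject group to an ordered list of (style, passage) pairs.

    Rows are groups, columns are styles, the latin factor is the passage,
    and the graeco factor is the order of presentation.
    """
    n = len(styles)
    square = graeco_latin_square(n)
    schedule = {}
    for group, row in enumerate(square):
        cells = [(order, styles[col], passages[p]) for col, (p, order) in enumerate(row)]
        schedule[group + 1] = [(style, passage) for _, style, passage in sorted(cells)]
    return schedule


styles = ["conventional", "spaced", "vertical"]      # hypothetical labels
passages = ["passage A", "passage B", "passage C"]   # hypothetical labels
plan = reading_schedule(styles, passages)
for group, sequence in plan.items():
    print(group, sequence)
```

In this sketch every group meets each style and each passage once, and across groups every style is paired with every passage exactly once, which is the balancing property the design relies on.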

Tachistoscope Experiment. The material consisted 
of 72 sentences of three different lengths—24, 16, and 
8 words—which were arbitrarily selected from sev- 
eral nontechnical books. These sentences were equal- 
ized in the number of words and syllables. They were 
typewritten, one sentence on one white card. 

The sentences of 24, 16, or 8 words were exposed 
in a Gerbrands tachistoscope for the following dura- 
tions: 4”, 2.5”, or 1”. The subject came to the ex- 
perimental room and read the instructions (written 
in the style to be investigated), and was thus fa- 
miliarized with both the procedures and the mate- 
rials. The subject was then given four or five prac- 
tice sentences in the tachistoscope. He read the ex- 
posed sentence, and as soon as the light in the 
tachistoscope went out, told the experimenter what 
he had read. The subject was scored for the number 
of correct words he reproduced—regardless of order. 
Synonyms were not counted as correct. 
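
The scoring rule just described (correct words regardless of order, with no credit for synonyms) can be sketched as a multiset overlap. The tokenization, the handling of repeated words, and the example sentence are assumptions for illustration; the text does not say how repeated words were credited.

```python
from collections import Counter
import re


def words(text):
    """Lower-case a sentence and split it into word tokens (assumed tokenization)."""
    return re.findall(r"[a-z']+", text.lower())


def score_report(target, report):
    """Count reported words that match target words, ignoring order.

    Each target word is credited at most as many times as it occurs in the
    target sentence; synonyms earn nothing because matching is exact.
    """
    overlap = Counter(words(target)) & Counter(words(report))
    return sum(overlap.values())


# Hypothetical sentence and report, not taken from the experimental material.
target = "The old man walked slowly to the door"
report = "the man slowly walked to a gate"
print(score_report(target, report))
```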


All the tachistoscope experiments used a Lindquist
(1953, p. 273) Type II design. To administer this de-
sign with our material for three different treatments,
one divides his subjects into three groups, divides the
72 sentences into three sets of approximately equal
difficulty, and types each of the 72 sentences in all
three treatments. A subject reads one set of sentences
in each experimental style.


2 The reading experiments were conducted by Cole-
man, and the tachistoscope experiments by Kim.


RESULTS 
Reading Experiment on Spaced Style 


Sixty-four subjects were used in this experi- 
ment which compared conventional to three 
variations of spaced style. Four passages of 
1500 words each were selected from Wood- 
worth (1938), each passage was typed in all 
four styles, and a 25-item test was prepared 
and mimeographed for each passage. 

All the variations in this experiment use 
only two spaces between units. Furthermore, 
in all variations, two spaces were used after 
commas and three spaces were used after pe-
riods. The three variations of the spaced style
were: (a) Clauses were separated by an extra
space. In addition to spacing between clauses,
phrases that modified a clause as a whole were
separated by an extra space. The units in this
variation averaged 7.25 words. (b) Grammati-
cal units (the subject plus its modifiers, verb
plus its modifiers, and object plus its modi-
fiers) were separated from one another when-
ever any two such adjacent units totaled more
than five words. This was in addition to the
separations above. The units in this variation
averaged 4.72 words. (c) Phrases (preposi-
tional, infinitive, and participial) were sepa-
rated. The units in this variation averaged
3.35 words.


TABLE 1
MEAN SCORES ON PASSAGES FOR
READING EXPERIMENTS

                                     Words Read    Questions Answered
Style                                per Minute    per Subject

Vertical Style
  Conventional
  8 letters, spaced into phrases
  8 letters, unspaced
  1 word, spaced
One Phrase to a Line
  Conventional
  Long phrases to a line
  Short phrases to a line                          14.7
Spaced Style*
  Conventional                       260           13.2
  Space between clauses              261.2         13.
  Space between grammatical units    261           13.8
  Space between phrases              261.          14.0

*In this experiment, subjects read more difficult selections.
Scores were multiplied by constants (260/204 and 13.2/10.3)
to make all styles roughly comparable.

The results reported in Table 1 are disap- 
pointing. The overall difference between styles 
is not significant using analysis of variance 
with a pooled error term. Even the t ratio be-
tween conventional and the best version of
spaced style is not significant. 


Reading Experiment on Vertical Style 


The first experiment used 32 subjects and 
conventional was compared to three varia- 
tions of the vertical style. Four passages of 
1680 words each were selected from de Kruif 
(1932), each selection was typed in all four 
styles, and a 20-question multiple choice test 
was prepared for each selection. Two words 
were grouped on the same line only if they 
totaled less than eight spaces. The three varia- 
tions of the vertical square span were: (a) 
Two words grouped together on the same line 
only if they totaled less than eight spaces, 
and the sentences were not spaced into thought 
units. (b) Two words grouped together on the
same line only if they totaled less than eight
spaces, and the sentences were spaced into
thought units. These thought units averaged
4.8 words to a unit. (c) Only one word to a
line and spaced into the same thought units
as in b.

Table 1 shows that the results are again dis- 
appointing. If we use reading speed as the cri- 
terion, a sign test is adequate to show that 
conventional is read significantly faster than 
all three vertical styles. Ignoring ties, the 
number of subjects who read conventional 
faster is 25 to 5, 24 to 6, and 26 to 1. These 
ratios are all significant beyond the .01 level. 
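
The sign-test claim above can be checked with an exact binomial calculation. The sketch below assumes a two-sided test with ties already dropped; the text does not say whether a one- or two-tailed test was used, and the function name is ours.

```python
from math import comb


def sign_test_p(wins, losses):
    """Two-sided exact sign test p value, with ties already excluded."""
    n = wins + losses
    k = min(wins, losses)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Counts reported in the text: subjects faster on conventional
# versus faster on each of the three vertical styles.
for wins, losses in [(25, 5), (24, 6), (26, 1)]:
    print(wins, losses, sign_test_p(wins, losses))
```

Under these assumptions all three p values fall below .01, in line with the reported result.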

If we use comprehension as the measure, 
all the differences still favor the conventional 
style, but none are significant. In other words,
even though untrained subjects do read verti-
cal style slower, they must read it more ac-
curately (Accuracy = number of questions an-
swered per word read). With difficult mate-
rial, some may argue that accuracy is more
important than speed. Therefore a second ex- 
periment was designed in which subjects were 
given ample time to finish each passage. 

In the second experiment two 232-word se- 
lections from Geldard (1953) and two of the 
previous selections from de Kruif were used. 
Only the most promising of the vertical styles, 
eight letters to a line and spaced, was com- 
pared with conventional. Forty subjects were 
given 2 minutes to read the short selections 
and 9 minutes to read the long ones. This was 
ample time for all to finish, and most of them 
glanced over the material a second time. The 
total results are 828 questions answered out 
of a possible 1120 for the conventional style, 
816 for the vertical style. This difference, of 
course, is not significant. In terms of reading 
for content as opposed to reading for speed, 
there does not seem to be much difference in 
favor of conventional even for unpracticed 
subjects. 


Results of Lieber’s Arrangement 


Only 18 subjects were available for this 
study—it is hardly more than a pilot study. 
The materials were three of the previously 
described selections from de Kruif. Each was 
typed in three styles: conventional, a version 
of rather short units that averaged 4.3 words 
on each line, and a version in which some of 
the previous units were combined into longer 
ones so that this version averaged 6.1 words 
to a line. The results reported in Table 1 are 
not significant. 


Results for Tachistoscope Series 


Three preliminary experiments were made 
to select the best arrangement within three 
styles: vertical, spaced, and square span. 
Table 2 shows the arrangements that were 
tested. Analysis of variance indicated that 
there was no significant difference among the 
variations within each style. Nevertheless, 
within any style the arrangement with the 
largest total score was selected for a final 
overall comparison of styles. 

In the fourth experiment, the selected ar-
rangements were compared with conventional.
The data were analyzed in terms of the num-
ber of words correctly reproduced. The aver-
age number of words reproduced per subject
for all lengths combined was 74.47 for con-
ventional, 80.22 for spaced style, 80.75 for
square span style, and 86.50 for vertical style.


TABLE 2
ARRANGEMENTS TESTED IN PRELIMINARY
EXPERIMENTS WITH TACHISTOSCOPE

Arrangements in Vertical Style

No. of Words    No. of Words    Total Score
in a Unit       to a Line       No. of Words

4.8                             3883.7

Arrangements in Spaced Style

No. of Words    No. of Spaces    Total Score
in a Unit       between Units    No. of Words

                                 3905.1
                                 3920.
                                 4004.1
                                 4010.7*

Arrangements in Square Span†

No. of Words    No. of Words    No. of Lines    Total Score
in a Unit                       in a Unit

* Selected for overall comparison of styles.
† For square span, only 21 subjects and 24-word sentences
were used.


TABLE 3
ANALYSIS OF VARIANCE FOR OVERALL COMPARISON
OF STYLES

Sources

Between subjects
  Interaction: Styles × Sets of sentences (between)    377.3
  Error (between)                                      639.5
Within subjects
  Styles                                               869.6
  Sets of sentences                                    394.3
  Interaction: Styles × Sets of sentences (within)     142
  Error (within)


Table 3 presents the analysis of variance for
the latin square design.

The F ratio for differences of styles is sig- 
nificant beyond the .01 level. (The difference 
between sets of sentences and the interaction 
between styles and sets of sentences are not
significant.)

By a t test, all three new styles were sig-
nificantly superior to the conventional style
beyond the .01 level. 


DISCUSSION
Spaced 


There have now been some six or seven 
comparisons of spaced and unspaced (Lieber’s 
arrangement is also a technique for spacing). 
Almost all of them have favored the spaced 
version, but only two of them, the one by
North and Jenkins and the one by Kim, were
significant. Although suggestive, the Japanese
study on spacing by Sato and Kusajima
should not be directly compared to the Eng-
lish studies. Japanese has three different kinds
of syllabary. The alternation of these kinds of
characters provides organizational cues similar
to the spaces between English words.

When all the studies are considered, they 
probably argue for a slight advantage in favor 
of spaced even with untrained subjects. Per- 
haps the effect is slight and variable because 
no one has yet used the optimum arrange- 
ment. Before he can divine the most effective 
way to group phrases, an investigator prob- 
ably needs a considerable knowledge of con- 
stituent analysis and American patterns of 
stress, juncture, and intonation. The present 
investigators, at least, are rather deficient in 
such linguistic training. So far, there have 
been no photographs of eye movements while 
reading the experimental styles. Such photo- 
graphs might give insights into more effective 
ways to group phrases. 

On the other hand, the effect may be slight 
and variable only because established reading 
habits are interfering with the cues from spac- 
ing. In the unfamiliar tachistoscope presenta- 
tion, the effect was significant and fairly large. 


Vertical 


In the tachistoscope series, the best vertical 
style of typography was definitely superior to 
the best of the other three styles. Studies with
Chinese and Japanese characters, in which
horizontal seems to be better than vertical, if 
not in comprehension then at least in speed, 
appear to contradict the results of this study. 
A possible explanation for this discrepancy in 
results may be in differences in the arrange- 
ments of characters. In the vertical arrange- 
ment in oriental languages, the characters are 
printed one above another so that the lines of 
print are long, narrow strings quite similar to 
the long, narrow strings in the horizontal ar- 
rangement. It would be somewhat analogous 
to English written in the following style. 
I 


In oriental languages, their vertical style 
wastes horizontal eyespan just as their (and 
our) horizontal style wastes vertical eyespan. 

When the subjects read long passages, there 
was no significant difference between vertical 
and conventional so far as number of ques- 
tions answered was concerned. But in terms 
of reading speed, the results directly contra- 
dict those with the tachistoscope: vertical was 
read significantly slower than conventional. 

The contradiction can probably be ex- 
plained by reading habits. For years, the sub- 
ject had been reading material in which the 
line above was not immediately related to the 
words that would come next. Similarly, be- 
cause the line below was unrelated, by the 
time he reached it he had forgotten that he 
had half-perceived these words before. Far 
from ever using these peripheral cues, he had 
to purposely suppress them for the words to 
make connected sense. 

In the novel tachistoscope situation, the 
contrary habits of suppression apparently did 
not interfere as much as when long passages 
were read. If this explanation is correct, given 
some training the reader might become able to 
exploit the additional eyespan to some ad- 
vantage. He might learn not to suppress the 
cues from the line above and the line below 
his fixation. 

Other explanations for the discrepancy be-
tween the tachistoscope and the reading ex-
periments are possible of course. Perhaps the
tachistoscope presentation adds some un- 
known constraint that favors the vertical 
style. Or the discrepancy may have been 
caused by a scoring difference. Perhaps con- 
ventional style is so easy to read that the sub- 
ject gets the meaning but fails to report the 
right word (which is what was scored here). 
However, a study by Tinker (1955) shows 
that readers do make rapid improvement in 
learning to read the vertical style. When all 
is considered, the tachistoscope results seem 
promising enough to justify some further ex- 
perimentation with trained subjects. 


SUMMARY 


Five styles of typography—spaced units, 
vertical, square span, an arrangement of one 
phrase per line, and conventional—were com- 
pared using untrained subjects. This paper re- 
ports two fairly independent series of experi- 
ments: a series using a tachistoscope pres- 
entation, and a series in which the subjects 
read passages of about 1500 words. 

In the familiar situation of reading long 
passages, the subjects were apparently unable 
to suppress established reading habits that in-
terfered with the new styles. Conventional was
read significantly faster than the vertical ar-
rangement. However, in terms of comprehen-
sion (total number of questions answered),
conventional showed only a slight, nonsignifi- 
cant advantage. When it was compared to 
spaced, conventional showed an equally slight 
and nonsignificant disadvantage. 

But in the tachistoscope series, in which 
the subjects were reading in an unfamiliar 
situation, three experimental styles were sig- 
nificantly superior to conventional. Vertical, 
spaced, and square span were all significantly 
superior, with vertical showing the largest advantage.

The tachistoscope series suggests that there 
are advantages to the new arrangements, but 
the reading series suggests that subjects must 
be trained to read these new arrangements 
before the advantage will be fully realized.

COMPARISON OF PERFORMANCE ON MANUAL
AND ELECTRIC TYPEWRITERS


ROBERT C. DROEGE AND BEATRICE M. HILL


United States Employment Service 


The electric typewriter is gaining greater 
popularity in schools and business offices to- 
day and is often used to the exclusion of the 
manual typewriter. Along with this wider use 
of the electric typewriter has come a growing 
demand for persons qualified to operate it. 
Although the electric typewriter is becoming 
more popular, there are still many typists who 
have had experience only on a manual type- 
writer. How well do these individuals perform 
when they switch over to an electric type- 
writer? Can their performance be predicted 
with any degree of accuracy? The specific pur- 
pose of this study was to compare perform- 
ance on the electric typewriter and manual 
typewriter and investigate the possibility of 
developing conversion tables which would en- 
able the prediction of performance on an elec- 
tric typewriter from performance on a manual 
typewriter. 

METHOD 


The sample consisted of 575 individuals with at 
least 6 months of experience on the electric type- 
writer. Of those individuals originally tested for the 
study 38 were dropped from the sample because they 
made an excessive number of errors (more than 60) 
on either the manual typewriter or the electric type- 
writer. It was obvious after examining the papers of 
individuals who made more than 60 errors in the 10- 
minute time limit, that these individuals could not 
meet the minimum standard of proficiency for typ- 
ing jobs. It was not even necessary to obtain a for- 
mal total of the number of errors made to reach 
this conclusion. With the elimination of the obvi- 
ously unqualified, the results become more applicable 
to those with at least a minimum of typing profi- 
ciency. The final sample consisted of individuals from 
seven states. The N in the various states ranged from 
33 to 105. All individuals were either Employment 
Service applicants or employed typists who volun- 
teered for the testing. 

All persons in the sample were tested initially on 
the electric typewriter with USES Typing Test Form 
No. 6 and retested on the manual typewriter with 
USES Typing Test Form No. 7. In most cases the 
two tests were given on the same date but in no 
case did the period of time between testings exceed 
2 weeks. The testing took place during the period of 
April 19, 1957 to January 13, 1958. The tests were
administered either at local Employment Service of-
fices or at the employed workers’ stations of work
by Employment Service personnel, according to the 
instructions contained in the USES Guide to the Use 
of Typing, Dictation, and Spelling Tests. 


RESULTS 


Table 1 shows the ranges, means, and stand- 
ard deviations of words-per-minute (wpm) 
and error scores for the total sample. Table 2 
shows the correlations between scores on the 
manual and electric typewriter, the differences 
in mean scores and standard deviations of 
scores on the two typewriters, and the t ratios
corresponding to the differences for the total 
sample. 

The following points are based on the re- 
sults shown in Tables 1 and 2: 

1. There is a substantial relationship be- 
tween wpm scores obtained on manual and 
electric typewriters but a lower relationship 
between error scores on the two typewriters. 

2. The 9.17 difference between mean wpm 
scores is significant at the .01 level, indicat- 
ing an advantage in favor of the electric type- 
writer. 

3. The 1.70 difference in standard devia-
tions of wpm scores on the two typewriters is
significant at the .01 level, indicating a greater
degree of variability in wpm typed on the
electric typewriter than on the manual type-
writer.

4. The 2.14 difference between mean error
scores is significant at the .01 level, indicat-
ing that fewer errors are made when the elec-
tric typewriter is used.

5. The 1.71 difference in standard devia-
tions of error scores on the two typewriters is
significant at the .01 level, indicating less
variability in error scores made on the elec-
tric typewriter than on the manual type-
writer.


TABLE 1
RANGES, MEANS, AND STANDARD DEVIATIONS OF
WORDS-PER-MINUTE AND ERROR SCORES ON THE
ELECTRIC AND MANUAL TYPEWRITER FOR
THE TOTAL SAMPLE
(N=575)

            Electric    Manual
WPM
  R         36-101      28-86
  M         65.28       56.11
  SD        11.22       9.51
Errors
  R         0-57        0-59
  M         14.80       16.93
  SD        10.19       11.90


TABLE 2
PRODUCT-MOMENT CORRELATIONS BETWEEN SCORES
ON THE MANUAL AND ELECTRIC TYPEWRITER (r),
DIFFERENCES IN MEAN SCORES AND STANDARD
DEVIATIONS OF SCORES ON THE TWO TYPE-
WRITERS (D), AND t RATIOS CORRE-
SPONDING TO THESE DIFFERENCES
FOR THE TOTAL SAMPLE
(N=575)

                      Differences in    Difference in
                      Means             Standard Deviations
            r         D                 D

WPM         .76       9.17              1.70
Errors      .62       2.14              1.71
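
The t ratios themselves are not legible in the tables as reproduced here, but their rough size can be recovered from the reported summary statistics. The sketch below assumes the standard formula for a difference between correlated means and the Pitman-Morgan formula for a difference between correlated variances; the study does not state which formulas were used, and the function names are ours.

```python
from math import sqrt


def paired_t_from_summary(mean_diff, sd1, sd2, r, n):
    """t ratio for a difference between correlated means, from summary statistics."""
    sd_diff = sqrt(sd1 ** 2 + sd2 ** 2 - 2 * r * sd1 * sd2)
    return mean_diff / (sd_diff / sqrt(n))


def correlated_variance_t(sd1, sd2, r, n):
    """Pitman-Morgan t ratio (n - 2 df) for a difference between correlated variances."""
    f = (sd1 / sd2) ** 2
    return (f - 1) * sqrt(n - 2) / (2 * sqrt(f * (1 - r ** 2)))


N = 575  # total sample size reported in the text
t_wpm_means = paired_t_from_summary(65.28 - 56.11, 11.22, 9.51, 0.76, N)
t_err_means = paired_t_from_summary(16.93 - 14.80, 11.90, 10.19, 0.62, N)
t_wpm_sds = correlated_variance_t(11.22, 9.51, 0.76, N)
t_err_sds = correlated_variance_t(11.90, 10.19, 0.62, N)
print(round(t_wpm_means, 1), round(t_err_means, 1),
      round(t_wpm_sds, 1), round(t_err_sds, 1))
```

Under these assumptions all four ratios exceed the two-tailed .01 critical value of about 2.58, which agrees with the significance levels reported in points 2 through 5.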

DISCUSSION 


The size of the relationship between scores 
on the manual and electric typewriter is a 
function of the reliability of the test and the 
effect of differences associated with operation 
of the two kinds of typewriters. To get some 
idea of the relative importance of these two 
factors, it is necessary to compare the corre- 
lations obtained in this study with estimates 
of repeat reliability of the test taken both 
times on the same kind of typewriter. Such 
estimates of the reliability of USES Typing 
Test Form No. 6 are available from a study 
by J. R. Cook (unpublished). The sample for 
the study consisted of 225 Iowa State Em- 
ployment Service applicants. They were tested 
in 1958 with Typing Test Form No. 6 on 
manual typewriters and then retested (after a 
short break) with the same form on the same 
typewriters. The test was administered with a
5-minute time limit instead of the 10-minute
time limit used in the present manual-electric 
typewriter study. The results showed that 
the correlation between initial and retest 
scores was .97 for words per minute and .85 
for errors. The differences between these cor- 
relations and those in the present study (.76 
for words per minute and .62 for errors) are 
significant at the .01 level, indicating that 
differences associated with operation of man- 
ual and electric typewriters are of some 
importance. 
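
The comparison of correlations above can be approximated with Fisher's z transformation for two independent samples. This is a sketch under the assumption that such a test was used; the text does not name the method, and the function name is ours.

```python
from math import atanh, sqrt


def fisher_z_difference(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations."""
    se = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (atanh(r1) - atanh(r2)) / se


# Retest reliabilities (N = 225) versus manual-electric correlations (N = 575).
z_wpm = fisher_z_difference(0.97, 225, 0.76, 575)
z_errors = fisher_z_difference(0.85, 225, 0.62, 575)
print(round(z_wpm, 1), round(z_errors, 1))
```

Both z values come out well above 2.58, consistent with the statement that the differences are significant at the .01 level.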

It may be concluded that an assessment of 
performance on the manual typewriter does 
not provide a completely satisfactory basis 
for predicting performance on the electric 
typewriter. While it is true that the correla- 
tion between wpm scores (r = .76) is sub- 
stantial enough to permit a fairly satisfactory 
prediction of speed qualifications on the elec- 
tric typewriter, the relationship between 
error scores (r = .62) is not high enough to 
permit satisfactory prediction of accuracy 
qualifications on the electric typewriter. 

The differences in mean wpm scores and
error scores attained on the two kinds of
typewriters might have been smaller if more
stringent controls had been placed on the 
experience requirements for the examinees. 
It will be remembered that all examinees were 
required to have at least 6 months of experi- 
ence on the electric typewriter, but there was 
no minimum amount of experience required 
on the manual typewriter. It is quite possible 
that if the amount and recency of experience 
on both kinds of typewriters had been equated, 
the scores obtained on the electric and manual 
typewriters would have been more in line 
with each other. Further, the effect of changes 
in emotional attitude occasioned by the shift 
from the electric typewriter (to which the 
examinees were more accustomed) to the
manual typewriter should not be overlooked 
as a factor affecting both speed and accuracy. 
While the extent to which emotional factors 
affected performance in this study is un- 
known, these factors very likely did have some 
influence on mean scores and correlations be- 
tween scores of individuals in the sample. 
Difficulty in adapting to the typing touch of 
the manual typewriter after using the electric 
typewriter is another possible factor which
may have affected the results in this study.
Various specific difficulties typists have in ad- 
justing to a change from an electric to a 
manual typewriter have been indicated by 
students in a study of training on manual and 
electric typewriters (Savage, 1953). 

The United States Employment Service is 
developing a new set of typing tests based 
on current material used in business and in-
dustry. Because of the results obtained in this
study, it has been decided that these new 
tests should be standardized separately for 
manual and electric typewriter operators. 


Once norms for the electric typewriter are 
established, applicants for positions calling 
for qualified electric typewriter operators can 
be tested on an electric typewriter and their 
scores can be evaluated against norms based 
on performance of a representative sample of 
electric typewriter operators. 


SUMMARY 


The possibility of developing conversion 
tables which would enable prediction of 
performance on an electric typewriter from 
performance on a manual typewriter was in-
vestigated. The sample consisted of 575 ex-
perienced electric typewriter operators from
seven states. They were tested initially on an 
electric typewriter and then retested on a 
manual typewriter. The tests used were equiv- 
alent forms of the United States Employment 
Service Typing Test. The results showed that 
there was a substantial relationship (r = .76) 
between words-per-minute scores on manual 
and electric typewriters but only a moderate 
relationship (r = .62) between error scores
on the two typewriters. A further analysis of 
the data showed that, on the average, indi- 
viduals in the sample typed about nine words 
per minute faster and made about two fewer 
errors on the electric typewriter than on the 
manual typewriter. These differences were 
statistically significant. The standard devia- 
tions of words per minute scores and error 
scores for the two typewriters were also sig- 
nificantly different.

AN AMERICAN APPLICATION OF EYSENCK’S SHORT
NEUROTICISM AND EXTRAVERSION SCALES 


WILLIAM D. WELLS, HOWARD E. EGETH, AND NANCY P. WRAY


Newark Colleges, Rutgers University 


In 1958 Eysenck published an article de- 
scribing two short personality scales suitable 
for use in market research interviewing 
(Eysenck, 1958). He reported that the scales 
gave reliable measurements on two independ- 
ent personality dimensions, and that there 
were significant differences along these di- 
mensions among certain segments of the Eng- 
lish consumer population. It has recently been 
possible to use Eysenck’s scales in some stud- 
ies of a group of American housewives, and 
the comparisons between his results and ours 
seem worth noting. 


SUBJECTS AND METHOD 


In comparing Eysenck’s data with data 
from the present study, it is important to keep 
in mind some major differences between the 
two sets of respondents. Eysenck’s 1,600 sub- 
jects were drawn from a nationwide sample 
of the English consumer population. His sam- 
pling procedure insured proper proportions of 
urban and rural residents, and proportional 
representation of the English population in 
terms of economic class, sex, and age. The 
180 subjects in our study were middle and 
upper-middle class housewives living in the 
metropolitan area surrounding Newark, New 
Jersey. They were considerably better edu- 
cated than the American national average 
(26% had at least some college) and they
were primarily of Jewish extraction. Their 
responses to Eysenck’s scales were collected 
during tryout of a number of questionnaire 
items in a pilot investigation. Because of the 
preliminary nature of the work, we had made 
no attempt to draw a representative sample 
of a defined population. 


RESULTS 


In spite of the differences between sets of 
respondents, it seemed worthwhile to find out 
whether our subjects differed significantly 
from Eysenck’s subjects on the neuroticism or
extraversion dimensions. They did, on both
(Table 1). The women we interviewed were
significantly less neurotic and significantly
less extraverted than the women in Eysenck’s 
sample. Although the inadequacy of our 
sample precludes any general comparison of 
English women vs. American women, it is 
perhaps comforting to note that the person- 
ality test scores of at least one group of Ameri- 
can women run counter to opinions expressed 
by some observers of the American scene. 

Of more general interest are the results ob- 
tained from analysis of the scales themselves. 
Eysenck reported a correlation of -.05 be-
tween his neuroticism scale and his extraver-
sion scale. This lack of correlation seemed 
puzzling in view of the fact that maximum 
scores on both of Eysenck’s scales are ob- 
tained by answering “yes” to all questions. 
Recently published research on “agreeing 
response set” (Couch & Kenniston, 1960) 
indicates that scales scored in this way almost 
always correlate at least moderately.

In our data, the correlation between Ey- 
senck’s neuroticism scale and his extraversion 
scale was —.08. This finding confirms the fact 
that the two scales are uncorrelated, despite 


TABLE 1
COMPARISON OF AMERICAN AND ENGLISH RESPONDENTS
ON EYSENCK’S NEUROTICISM AND
EXTRAVERSION SCALES

              Neuroticism Scale      Extraversion Scale
Statistic     American   English     American   English
M               -.04       1.00        .92        1.71
SD              3.         3.42        2.96       2.97
N               180        800         180        800
t                    3.                     3.24

TABLE 2
COMPARISON OF SMOKERS AND NONSMOKERS ON EYSENCK’S SCALES AND ON AGREEING RESPONSE ITEMS

              Neuroticism           Extraversion          Agreeing Response
Statistic   Smokers  Nonsmokers   Smokers  Nonsmokers   Smokers  Nonsmokers
M            -.12      .08                               80.2      76.6
SD           3.06     3.88                               11.1      11.9
N            107       73                                107       73
t                .37

* Significant beyond .01 level.


the potential mutual variance introduced by
the scoring system. Some information on the
influence of the scoring system is provided
by correlations between Eysenck’s scales and
a scale of 20 items found by Couch and Ken-
niston to be highly loaded on the agreeing
response set dimension.¹ Eysenck’s neuroti-
cism scale correlated .35 with these items,
and his extraversion scale correlated .28 with
them. With an N of 180, both these correla-
tions are significant well beyond the .01 level.

¹ The items were the 19 items in Couch and Ken-
niston’s Table 9 plus the sixth item in their Table 8.

Further understanding of the scales can be
gained from an examination of their reliabili-
ties. Eysenck reported corrected split-half re-
liabilities of .79 for the neuroticism scale and
.71 for the extraversion scale. In our data the
corrected split-half reliability for the neuroti-
cism scale held up reasonably well—it was .72.
But for extraversion the corrected split-half
reliability was a disappointing .41. For the
Couch and Kenniston items it was .62.

Fitting these findings together, it appears
that both the neuroticism scale and the extra-
version scale are significantly related to agree-
ing response set, but this relationship is not
strong enough to force the scales into corre-
lation with each other. Because the reliabili-
ties of all three measures are only middling,
it is possible that both neuroticism and extra-
version might turn out to be more strongly
related to agreeing response set if the measure-
ments could be made more accurately.

Eysenck reported significant age and social
class differences on the neuroticism scale, the
lower class and younger age groups being
slightly more unstable emotionally. In our
data the difference was in the same direction 
for age, and in the opposite direction for class. 
Both differences were small and neither was 
statistically significant. 

Eysenck also reported a significant differ- 
ence on his extraversion scale between 
“drinkers” and “nondrinkers.” We did not 
ask about drinking habits, but we did ask 
about smoking. The smokers among our re- 
spondents did not differ significantly from the 
nonsmokers in either neuroticism or extra- 
version (Table 2). They did differ signifi- 
cantly on the agreeing response set dimension. 
To the extent that our sample is representa- 
tive, women who smoke appear to be slightly 
more likely to say “yes.” 
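
The nonsignificant neuroticism comparison in Table 2 can be checked from the summary statistics alone. The sketch below is an illustration, not the authors' stated procedure: it assumes the t ratio was formed with a separate-variance standard error, and it reads the Table 2 neuroticism entries as M = -.12, SD = 3.06, N = 107 for smokers and M = .08, SD = 3.88, N = 73 for nonsmokers. The function name is hypothetical.

```python
import math


def t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """t ratio for two independent groups from means, SDs, and Ns.

    Assumption: the standard error uses each group's own variance
    (sd**2 / n) rather than a pooled variance.
    """
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (m1 - m2) / standard_error


# Neuroticism row of Table 2: nonsmokers minus smokers.
t_neuroticism = t_from_summary(0.08, 3.88, 73, -0.12, 3.06, 107)
print(round(t_neuroticism, 2))  # 0.37
```

Under that assumption the result agrees with the tabled value of .37, which is far below the 1.96 needed for significance at the .05 level.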


SUMMARY 


Eysenck’s short neuroticism and extraver- 
sion scales were used in interviews with 180 
American housewives. The scales proved to 
be uncorrelated with each other, even though 
both were significantly correlated with a meas- 
ure of agreeing response set. The reliability 
of the neuroticism scale was .72; and of the 
extraversion scale, .41. Some differences be- 
tween the present results and Eysenck’s re- 
sults were discussed. 
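
The reliability figures reported above lend themselves to two short calculations. The sketch below is illustrative only and was not part of the original analysis. It assumes that "corrected split-half reliability" refers to the Spearman-Brown formula, and it applies the classical correction for attenuation to the reported correlations of .35 and .28 with the agreeing response items (reliability .62) to show how strong those relationships might be if the measures were perfectly reliable. All function names are hypothetical.

```python
import math


def spearman_brown(r_half):
    """Full-length reliability from the correlation between two half-tests."""
    return 2 * r_half / (1 + r_half)


def half_from_corrected(r_full):
    """Inverse of the Spearman-Brown formula: half-test correlation implied
    by a corrected split-half reliability."""
    return r_full / (2 - r_full)


def disattenuate(r_xy, r_xx, r_yy):
    """Classical correction for attenuation: estimated correlation between
    two measures if both were perfectly reliable."""
    return r_xy / math.sqrt(r_xx * r_yy)


AGREEING_RELIABILITY = 0.62  # agreeing response items, as reported

# (scale, correlation with agreeing items, corrected split-half reliability)
scales = [("neuroticism", 0.35, 0.72), ("extraversion", 0.28, 0.41)]

for name, r_xy, reliability in scales:
    r_half = half_from_corrected(reliability)
    corrected = disattenuate(r_xy, reliability, AGREEING_RELIABILITY)
    print(f"{name}: half-test r = {r_half:.2f}, disattenuated r = {corrected:.2f}")
```

On these assumptions both disattenuated correlations come out a little above .5, which is consistent with the suggestion that the scales might prove more strongly related to agreeing response set if measured more accurately.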
“REAL-LIFE” FAKING ON THE STRONG VOCATIONAL
INTEREST BLANK BY SALES APPLICANTS 


WAYNE K. KIRCHNER 


Minnesota Mining and Manufacturing Company 


The Strong Vocational Interest Blank can 
be faked. Studies by Longstaff (1948), Geh- 
man (1957), Benton and Kornhauser (1948), 
Bordin (1943), and Strong (1943), among 
others, have shown that persons when directed 
to fake the SVIB can do so rather effectively. 
Longstaff has shown that the Strong can be 
faked upward fairly easily. Gehman has shown 
that engineering students can look like social 
service type persons if directed to do so. 
Benton and Kornhauser, too, have shown 
that medical interests can be faked. In gen- 
eral, then, “faking” can be accomplished on 
the SVIB if the subjects are asked to do this. 

All of these studies, however, seem lacking 
in “real-life” motivation. As far as can be
determined in these studies, the subjects all 
were students who were asked to fake the 
test. This approach is, of course, worthwhile 
but it does not get at faking in more natural 
situations such as selection. It would seem 
worthwhile to investigate the responses of 
persons not directed to fake but who are tak-
ing the test under conditions where it is to 
their advantage to “look good.” The prime 
example of this, of course, would be the job 
applicant completing the SVIB as part of the 
selection process. 

This study is an attempt to throw some 
light on possible faking of the SVIB in a real- 
life situation by analyzing responses made by 
sales applicants (later hired) with those of 
presently employed salesmen. 


METHOD 


As part of a lengthy follow-up study in 
1960 of 1957 sales applicants who became 
salesmen in a large, midwestern company, 
Strong Vocational Interest Blank data was 
reviewed for the total sample of 258 such ap- 
plicants. These persons all had completed the 
Strong prior to being hired as part of the 
selection process. 


From this original group, two subgroups 
of 92 Retail applicants and 64 Industrial 
applicants, respectively, were obtained on the 
basis of later job duties. In effect, this pro- 
duced fairly “pure” groups of persons en- 
gaged strictly in Retail selling or strictly in 
Industrial selling. This division was deemed 
necessary because a myriad of studies, includ-
ing those of Dunnette and Kirchner (1960),
Witkin (1956), and Hughes and McNamara 
(1958), have shown that there are different 
kinds of salesmen with different interest pat- 
terns. For example, Dunnette and Kirchner 
have found that salesmen engaged in Retail 
selling are different from their Industrial 
counterparts on personality measures and on 
the SVIB, with Retail salesmen tending to be 
more like the traditional stereotype of the 
salesman. 

These differences could obscure or confound 
tendencies toward faking; hence, the split was 
made into the two basic categories of sales- 
men found in this company. 

Following this, SVIB occupational scale 
responses for the applicant groups were 
compared with those of presently employed 
company salesmen engaged in Retail or In- 
dustrial selling. These salesmen groups (Re-
tail, N = 68; Industrial, N = 49) were part
of a random sample of 196 salesmen from a 
total group of over 700 who took the SVIB 
and other tests on a volunteer basis as part 
of a concurrent validity study. Each sales- 
man in the two comparison groups had a 
minimum of 5 years’ sales experience in the 
company. The division in terms of job duties 
was accomplished by using the Sales Job De-
scription Checklist (Dunnette & Kirchner,
1958). Summing up then, SVIB responses for 
applicants who later became Retail and In- 
dustrial salesmen were compared with those 
of veteran Retail and Industrial salesmen. 

Scoring of the SVIB profiles was done by 
converting scores on each occupational

TABLE 1


MEANS AND STANDARD DEVIATIONS ON 48 STRONG VOCATIONAL INTEREST BLANK SCALES FOR RETAIL
AND INDUSTRIAL SALESMEN AND SALES APPLICANTS


Retail 
\pplicants Mean Difference r Applican Mean Difference 
V =92) 
- »plicant- - (Applicant 
alesmen) Salesmen) 


iculture 


hys. Director 
irector 


74 
97 


inistrator 
*tary 
al Science Teacher 
City School Supt. 
Social Worker 
Minister 


Vist 


Musician 


CPA 

Senior CPA 
\ccountant 
Office Man 

I sing Agent 


Sales Manager 
Real Estate Salesman 
Life Insurance Salesman 


Advertising Man 
Lawyer 
Author-Journalist 


Pres.-Manufacturing Concern 
Interest Maturity 

Occup. Level 
Masculine-Feminine 


* k is the mean difference between the two groups expressed in terms of the standard error of the difference:

k = \frac{M_1 - M_2}{\sqrt{\dfrac{\sigma_1^2}{N_1} + \dfrac{\sigma_2^2}{N_2}}}

All k values of 1.96 or greater are statistically significant at the .05 level of probability.

Artist 1.79 .77 1.36 .70 —.43 3.62 1.67 .72 1.50 81 —.17 1.32
Psychologist 1.74 .87 1.99 .25 1.85 2.35 2.56 .94 19 1.07 
Architect 1.38 1.23 82.88 —.56 3.19 1.47.94 1.16 1.06 —.31 1.64 
Physician 1.97 86 1.60 84 —.37 2.27 1.03 2.19 —.08 45 
Osteopath 2.99 (98 2.65 .78 ~.34 2.36 3.10 .95 2.89 (85 24 1.22 
th Dentist 1.84 80 1.41 67 —.47 3.93 1.86 .90 1.72 80 -.14 1.06 Ki 
Veterinarian 2.84 .92 2.13 1.15 —.71 4.41 2.61 2.35 .76 6 1.63 
Mathematician 50.86 1.42 (83 53.71 09 
Physicist 17.50 —.11 1.26 1.02 ~.20 1.10 
Engineer 2.16 2.03 .49 —.13 1.28 2.65 .92 2.16 .72 -.49 3.08 
Cheinist 1.69 143 .82 —.26 1,93 2.06.31 2.02 1.00 ~.04 95 
Production Manager 3.56 3.76 .72 20 1.58 3.96 .76 3.77 87 19 1.24 
Farmer 3.16 61 2.93 —.23 2.22 3.27 .69 3.20 $9 —.07 60 
ate Aviator 3.37 80 3.03 80 —.34 2.66 3.74 78 3.44 78 x0 2.03 a 
Carpenter 2.15 .36 1.99 1.01 16 1.10 2.39 1.07 2.31 o1 —.08 +2 
Printer 3.07 4 3.13 81 .06 45 3.25 96 3.48 83 23 1.34 
Mathematics Physics 
Science Teacher 3.13 1.07 —.11 79 3.53 3.44 1.10 ~.09 49 
Roe Industrial Arts Teacher 1.44 1.18 1.36 1.17 —.08 43 2.02 1.13 1.84 1.06 —.18 86 ake 
| \ tional Agr 
| Teacher 2.66 2.63 .98 —.03 20 2.80 2.88 .89 08 15 

Policeman 3.50 3.61 76 ll 87 3.55 78 3.66 78 11 74 

Forest Service Man 2.56 .88 2.12 1.01 —.44 2.94 2.76 2.53 .78 —.23 1.51 

YMCA 3.53 1.08 3.81 .28 3.53 1.13 3.85 32 1.65 
4.22 82 4.40 .22 | 4.43 81 4.33 86 -.10 65 
3.81.77 «4.05.75 24 410 75 419 77 09 63 
3.24 86 366 .76 42 3.14 90 3.66 52 3.19 
3.69 (88 4.11 .79 A2 3.55 88 3.98 95 $3 2.49 
2.56 27 2.78 2 2.53 2.78 25 1.69 
3.62 .91 3.87 25 3.76 3.98 22 1.23 
2.10 .96 2.32 .81 .22 2.18 .98 2.41 .83 23 1.32 

2.82 .72 2.86 .81 04 33 2.94 74 3.25 92 31 1.98 
2.38 .71 2.64 .26 1.98 2.48 .87 13 86 
3.46 85 3.36 .89 40 2.88 382 4.05 .71 .23 1.45 
2.97 .76 3.69 84 72 5.66 3.08 20 3.52 1.02 44 2.43 
4.22 .72 442 .65 20 «1.81 400 432. 322.8 
3.63 64 3.87 .71 24 2.24 3.39 88 3.65 84 26 1.59 : 
Banke 2.90 .71 3.13 85 23 1.86 2.71 76 2.91 381 .20 1.35 
ath Mortician 4.29 83 442 .62 13 1.09 3.96 8410 .72 14 .89 
Pharmacist 3.84 82 3.96 67 12 99 3.49 3.92 90 43 
ad 
5.02 .77 8964.80.36 —.22 2.19 4.71 67 1.66 46 —.05 45 
4.94 80 471 —,23 2.18 4.67 .80 459 08 6? 
4.97 .82 4.52 .60 —.45 3.83 4.47 91 4.33 65 —.14 o1 
oh 
nee ti, 3.99 76 3.84 67 —.15 1.30 3.82 75 3.73 .64 —.09 67 
3.00.69 3.04 04 36 2.96 2.98 .69 02 16 
2.82 .61 2.65 .69 —.17 1.65 2.71 .64 2.50 .66 —.21 1.70 

3.25 3.38 13 1.16 3.35 85 3.33.63 02 
5.15 4.92 32 23 2.95 5.27 .48 4.94 (13 33 4.68 
§.25 .63 4.71 .45 54 6.02 5.10 .79 4.71 .44 39 3.11 
434 80 426 64 68 4.74 80 458 .53 16 1.21


scale into an arbitrary eight-step scale (pri-
marily to avoid negatives) as follows:

SVIB T SCORE              EIGHT-STEP SCALE
(Occupational Scales)
       to +4                     1
     5 to 14                     2
    15 to 24                     3
    25 to 34                     4
    35 to 44                     5
    45 to 54                     6
    55 to 64                     7
    65 to 74                     8

M-F, I-M, OL scales were converted from 
T scores to a seven-step scale as follows: 
10 to 19, 1; 20 to 29, 2 etc. 
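
The two conversions can be expressed compactly. The sketch below is a hypothetical illustration rather than the scoring procedure actually used. The seven-step rule follows the stated pattern (10 to 19 becomes 1, 20 to 29 becomes 2, and so on). For the eight-step scale it assumes contiguous ten-point bands ending at 4, 14, ..., 74 and numbered 1 through 8; the lower limit of the first band is not legible, so any score of 4 or below is treated as step 1.

```python
def seven_step(t_score):
    """M-F, I-M, and OL scales: 10-19 -> 1, 20-29 -> 2, ..., 70-79 -> 7."""
    if not 10 <= t_score <= 79:
        raise ValueError("T score outside the 10-79 range of the seven-step scale")
    return t_score // 10


def eight_step(t_score):
    """Occupational scales, under the assumed bands ending at 4, 14, ..., 74.

    Assumption: scores of 4 or below (including negatives) fall in step 1.
    """
    if t_score > 74:
        raise ValueError("T score above the top band of the eight-step scale")
    if t_score <= 4:
        return 1
    return (t_score + 5) // 10 + 1


for t in (-3, 4, 5, 14, 45, 74):
    print(t, eight_step(t))
```

Collapsing T scores into a few positive steps keeps every scale on the same small range, which is what allows the means in Table 1 to be compared without negative values.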

Mean scores and standard deviations were 
computed for the four groups, Retail Sales- 
men, Retail Applicants, Industrial Salesmen, 
and Industrial Applicants, and mean differ-
ences and k values were also compiled.
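
As a check on how the k values behave, the Artist scale for the Retail groups can be recomputed from the tabled figures. The sketch below rests on an assumption about the garbled column headings: that the first mean and SD (1.79, .77) belong to the 68 Retail salesmen and the second pair (1.36, .70) to the 92 Retail applicants, which is the reading that reproduces the printed Applicant minus Salesmen difference of -.43. The function name is hypothetical, and k is taken as the absolute difference divided by the standard error of the difference.

```python
import math


def k_value(m_applicant, sd_applicant, n_applicant,
            m_salesman, sd_salesman, n_salesman):
    """Return (mean difference, k) for applicants versus salesmen.

    k is the absolute mean difference in units of the standard error
    of the difference, using each group's own variance.
    """
    difference = m_applicant - m_salesman
    standard_error = math.sqrt(sd_applicant ** 2 / n_applicant
                               + sd_salesman ** 2 / n_salesman)
    return difference, abs(difference) / standard_error


# Artist scale, Retail groups (assumed column assignment, see above).
difference, k = k_value(1.36, 0.70, 92, 1.79, 0.77, 68)
print(round(difference, 2), round(k, 2))
```

The computed k of about 3.63 is close to the printed 3.62, the small gap being attributable to rounding in the tabled means and SDs, and it is well beyond the 1.96 criterion.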

The critical assumption behind all these 
comparisons was that presently employed 
salesmen with over 5 years’ experience had a 
very small “axe to grind” and were fairly 
honest in their answers while the applicants 
striving to get a job were more prone to look 
good. Thus, the applicant group was assumed 
to lean more toward faking and differences 
between the two groups on the SVIB were 
hypothesized to be the result of faking 
tendencies.

RESULTS AND DISCUSSION


Basic comparisons and results are shown 
in Table 1. 

Two results are readily apparent. First, 
there are many and consistent differences 
between SVIB scale scores of applicants and
salesmen. Of the 96 mean differences shown, 
32 are statistically significant at the 5% 
level of probability or better. Second, appli- 
cants tend to score higher in Business occu- 
pations and Social Service jobs and lower in 
Scientific and Technical occupations and 
markedly higher in the Occupational Level 
scale. It also appears that applicants were 
lower on Sales keys on the SVIB than the 
sales groups. These results suggest then that 
applicants who are assumed to be leaning 
forward on their SVIB responses to look good 
are not too good at “upping” their scores on 


by Sales Applicants 


the Sales keys. Instead, they “boost’’ their 
scores in Business, Social Service, and Person- 
nel areas plus showing a stronger professional 
orientation (higher OL). 

Why does this occur? The answer in this 
case seems to be that applicants in filling out 
the SVIB answer sheet are shying away from 
“dislike” answers. They tend to be giving 
more “like” and “indifferent” answers. As
Berdie (1943) has shown, and as anyone
can verify for himself by completing three
SVIB answer sheets with all “likes,” all “in- 
differents,” and all “dislikes,” like responses 
are associated with high scores in Social Serv- 
ice (Group V) activities and in Business De- 
tail jobs. Likewise, indifferent responses are 
helpful in getting higher Social Service scores. 
Dislike responses boost technical and scientific 
scores. 

One hypothesis that might explain this is 
that applicants try to answer in the most so- 
cially acceptable fashion. This could mean 
showing a liking for most things. Veteran 
salesmen, on the other hand, should not have 
as great a need to give socially acceptable 
answers and can “confess” they dislike a few 
things. 

In this vein, it is interesting to note that 
in a study by Dunnette, Kirchner, and De- 
Gidio (1958) the Strong area that correlated 
highest with the Good Impression scale on the 
California Psychological Inventory was the 
Social Service area. 

“Real-life” faking for sales applicants at 
least as shown in this study, then, boils down 
to showing a liking for many things which 
may be the socially desirable way to complete 
the SVIB. Unfortunately, this does not help 
the applicant much in boosting his Sales key 
scores but strongly affects other areas. It may 
reflect a naive kind of test taking behavior on 
the part of the applicant. 


SUMMARY 


Responses made on the Strong Vocational 
Interest Blank for 92 Retail and 64 Indus- 
trial sales applicants (later hired) as part of 
the selection procedure were compared with 
SVIB responses made by 68 Retail and 49 
Industrial salesmen employed at least 5 years 
who completed the SVIB voluntarily as part 
of a concurrent validity study. It was hy-
pothesized applicants would be trying to
“look good”; salesmen would be more “hon-
est.” Of 96 mean differences on the 48 scales,
32 were significant at the .05 level. Applicants 
tended to be higher in both Retail and Indus- 
trial settings in Social Service and Business 
occupations and lower in Technical, Scientific, 
and surprisingly, Sales. Apparently, applicants 
indicate a greater liking for things than do 
employed salesmen, which suggests the idea of 
completing the SVIB in the most socially 
acceptable fashion: i.e., liking much, disliking 
little.

ORGANIZATION AND CREATIVE PROBLEM
SOLVING *

The Western Electric studies (Mayo, 1933;
Roethlisberger & Dickson, 1939) were the 
first of an ever-increasing number which have 
pointed to the tremendous motivational po- 
tential which could be unleashed if organiza- 
tions attended to workers’ needs. More re- 
cently social scientists and businessmen have 
suggested that people in lower levels of or- 
ganizations have problem solving capabilities 
which also have been relatively unused (Mc- 
Gregor, 1960; March & Simon, 1958; Worthy, 
1959). Because most companies follow tradi- 
tional organizational theory and practice, de- 
cision making authority is retained at higher 
levels of the organization. People at lower 
levels rarely are asked to make decisions and 
become accustomed to accepting orders from 
above. 

At every level of management the superior 
frequently makes decisions about subordinate 
behavior which he feels are necessary to 
attain some organizational objective. Fre- 
quently, neither the subordinates’ feelings 
about the decision nor the subordinates’ in- 
formation, knowledge, or skills relevant to the 
decision are considered. The ignoring of sub- 
ordinates’ feelings frequently results in a 
poorly motivated acceptance of the decision. 
Failing to consider the subordinates’ informa- 
tion, knowledge, and skills often results in 
forcing subordinate acquiescence to a truly 
poor decision. 

People familiar with the operations of large- 
scale organizations will have no difficulty re- 
calling instances where poor decisions have 
been carried out because higher management 
insisted that they be done, even though the 
subordinates knew the action was doomed to 
failure. To what extent are organizations 
which operate under the classical management
philosophy failing to use the human creative
potential which is available to them?

1 This investigation was supported by a USPHS
research grant (M-2704) from the National Institute
of Mental Health, United States Public Health
Service.

This paper reports one answer to this ques- 
tion by comparing the frequency of creative 
problem solutions obtained from groups which 
varied in their amount of experience and 
identification with existing organizations, es- 
pecially in business and industry. 


METHOD 
Problem 


The role playing case, the Change of Work Pro- 
cedure problem (Maier, 1955), was used to test for 
creative problem solving. Although the case is de- 
scribed completely in the cited reference, a summary 
of its more relevant characteristics will be given here. 

A foreman attempts in a group meeting to con- 
vince three workers to change from their present 
work method (Old solution) to a work method 
recommended by a time-study man (New solution).
When the foreman decides “to take up the problem 
with the men,” a conflict arises between the economic 
advantages of each man working only on his best 
position, as against the relief from boredom pres- 
ently gained by rotating hourly among the three 
positions. In most groups the foremen try to convince 
the workers of the advantages of the New work 
method and, after the relative merits of the Old 
and New methods are discussed, the workers accept 
(most do) or reject the New method. Occasionally 
a group will break out of this choice situation and 
develop an alternative work method which both 
exploits the abilities of the workers to do different 
jobs and avoids the boredom resulting from working 
on a repetitive job all day. These solutions have 
been called Integrative and, on the basis of previous 
studies using this case, appear to be a valid index 
of the creative problem solving ability of a group.
Integrative solutions have been produced more fre- 
quently: (a) by groups led by foremen trained in 
human relations as compared to untrained foremen 
(Maier, 1953), (b) by groups solving the problem 
a second time as compared to their first solution 
attempts (Maier & Hoffman, 1960), and (c) by 
groups composed of heterogeneous personalities as 
compared to groups of homogeneous personalities 
(Hoffman, 1958; Hoffman & Maier, 1961). The 
ability to turn a situation of choice between two 
work methods into a problem situation and develop 
a work method different from those that are obvious 
from the role instructions—a method which incorporates
... of both methods—generally suggests a
creative orientation to problems. The proportion of
Integrative solutions produced by each of the
experimental groups will be compared.


TABLE 1

...NS OF SOLUTIONS FROM GROUPS WITH DIFFERENT ORGANIZATIONAL IDENTIFICATION

                                              Type of Solution
Group                                       New    Integrative    Total      N
Employed                                    00.1      11.6        100.0     69
Business Administration students            71.5      21.4        100.0     28
Psychology of Human Relations students      42.0      42.0        100.0     50
Introductory Psychology students            40.6      46.9        100.0     32

... these data is ... at the .01 level of confidence.
... airline managers (10), training directors (10), hospital ...


Subjects

Subjects from four different populations role-played
the case. The populations were selected to
represent differing degrees of experience and/or
identification with vocational career in large-scale
organizations, especially business or industrial
organizations.

Organizationally Employed. Sixty-nine groups of
people presently employed in a variety of large
organizations represent the population with the
greatest vocational commitment. Except for 10 groups
of airline managers who role-played the case early
in a one-week training program, all other groups
solved the problem during one-day conferences
following an hour's lecture on motivation. The
remaining 59 groups consisted of 11 groups of
industrial foremen,² 10 groups of industrial training
directors, 7 groups of members of the administration
of a hospital, and 31 groups of nursing supervisors.
Despite this diversity of organizational experience,
the distributions of solutions from these several
groups hardly differed at all. In addition to being
currently employed, the members of these groups
tended to be somewhat older than subjects from the
other three populations.

Business Committed. Junior, senior, and graduate
students in personnel administration courses at the
School of Business Administration provided 28
groups.³ Since most of the graduates of this school
follow careers in one of the major corporations,
these subjects may be presumed to have made some
vocational commitment to working in large-scale
organizations. In addition most subjects had had
considerable part-time, and quite often full-time, work
experience, and may be considered to be familiar
with the realities of organizational life. The case was
administered as an exercise in the “leadership”
section of the course, about two-thirds of the way
through the semester.

Human Relations Interested. Fifty groups were
obtained from students in two semesters of an
undergraduate psychology course entitled Psychology
of Human Relations. About half the students were
sophomores, juniors, and seniors in the literary college,
while the other half were enrolled in the various
other schools of the university. Although the course is
a first one in industrial psychology and a large
proportion of the students had some work experience,
they were much less committed than were the business
administration students to a business career.
The problem was solved, in these classes, approximately
midway in the semester, in connection with
the topic of motivation.

Introductory Psychology. Thirty-two groups from
an undergraduate introductory psychology course
represent the population farthest removed from the
business scene. Subjects in these groups were, typically,
freshmen and sophomores from the literary
college, having little or no previous work experience.
The case was role-played as an example of “social
psychology applied to industry.”

RESULTS 


The problem solving results, as shown in
Table 1, reveal a consistent decrease in the
proportion of Integrative solutions as one 
compares the groups with the least degree of 
identification and experience with business to 
groups with the most. The chi square value 
for this relationship is 21.90, significant at 
the .01 level of confidence. 

In addition to the general trend of the 
results, the patterns of solutions are also sug-
gestive. The results obtained from students
in the two psychology courses are almost 
indistinguishable, although slightly favoring 
groups in the introductory course. Similarly,
the results from the presently employed and
the business administration groups are very 
much alike. There is one important difference, 
however, between these two latter groups. A 
significantly higher proportion of Old solu- 
tions was obtained from the employed than 
from the business administration groups. 

A comparison of these two pairs of groups 
indicates an incisive break. Group-by-group 
comparisons show that the employed and the 
business administration groups as compared 
with the human relations and the introductory 
psychology groups produced significantly more 
New solutions and significantly fewer Inte-
grative solutions.⁴


DISCUSSION 


The results are straightforward, but many 
interpretations are possible. The possibility 
that subjects in the employed groups are less 
well educated than subjects in the student 
groups can probably be ruled out. A large 
number of the subjects in the employed 
groups probably had some college training. 
In any case, Maier (1953) has shown that 
even foremen of assembly-line operations are 
capable of producing Integrative solutions 


when they are trained in the techniques of 


group discussion. Integrative solutions are 
easily achieved once the orientation is towards 
problem solving rather than making choices. 

Past experience with the Change of Work 
Procedure problem suggests the importance of 
the subjects’ orientation towards group situa- 
tions in terms of problem solving opportuni- 
ties rather than authority relations. The prob- 
lem solving orientation seems conducive to 
arriving at Integrative solutions as both the 
foreman and workers freely contribute their 
ideas about the work situation. Concern with 
authority relations usually produces New or 
Old solutions, as the workers accept or reject 
the foreman’s authority to force them to 
change their work method. 

On the basis of this interpretation, the re-
sults suggest that the human relations and
the introductory psychology groups were more
often characterized by problem solving discus-
sions, while the employed and the business
administration groups evaluated the problem
in terms of accepting or rejecting the fore-
man’s authority.

⁴ The only exception to this conclusion is the lack
of significant difference between the proportions of
New solutions produced by the employed and the
introductory psychology groups.

What accounts for this difference? Do the 
experiences of working in business and other 
organizational settings becloud all problems 
with the authority relations involved? Are the 
penalties for “bucking” authority in tradi- 
tional organizations so great that the unac- 
ceptability of a foreman’s suggestion can only 
be expressed by elaborating the merits of the 
status quo, rather than thinking of other 
alternatives? Do organizationally experienced 
people assume, when playing the foreman’s 
role, that their suggested solution should be 
accepted by the workers by virtue of the 
formal authority the foreman holds? The at- 
titude of the business administration students 
appears to have confirmed this view. They 
were even less likely than the presently em- 
ployed groups to resist the foreman’s sugges- 
tion and, in fact, usually went along willingly 
with the new suggestion. 

The results of this study provide suggestive 
empirical support for the proposition that 
the usual formal authority structure found in 
present day organizations tends to inhibit the 
expression of the creative potential of their 
members. Groups with little or no identification
with, or experience in, business produced more
than three times as large a proportion of
Integrative solutions as did the groups of
presently employed people, who have more
familiarity with the background of the problem.
If this effect were to hold true for other
problems, organizations would be failing to use
the creative capability that they possess in
their ranks.

The high proportion of acceptance of the
New method by the groups of business
administration students raises another question,
one that bears on the difficulty business may
have in achieving creative problem solving
among its employees. The question is this: Are
people being attracted to industry who are able
to work comfortably in the formal authority
system and are willing to accept decisions
from their bosses because “that is the right
thing to do”? If the answer to this question
is “yes” and the current practice of hiring
business school graduates continues at its
present rate, business may be hiring too many
“yes men.” The net result may be a still
further decrease in the expression of creativity
in industrial organizations.

Awareness of the inhibiting effects on crea- 
tive problem solving of present-day adminis- 
trative practice is a first step toward more 
effective management. Adoption of the man- 
agement philosophy espoused by McGregor 
(1960) and Worthy (1959) and development 
of the attitudes and skills necessary to the im- 
plementation of this philosophy (Maier, 1952, 
1958) should follow. 


SUMMARY 


Groups from four populations, differing in
their amount of experience in and identification
with industrial vocations, were compared in
their performance on the Change of Work
Procedure problem. Arranged from most to
least identified, there were 69 groups of people
presently employed in large organizations, 28
groups of business administration students, 50
groups of students in a human relations
course, and 32 groups of students from an
introductory psychology course.


The percentages of Integrative (creative)
solutions to the problem were 11.6% by the
employed groups, 21.4% by the business
administration groups, 42.0% by the human
relations groups, and 46.9% by the introductory
psychology groups. These differences are
statistically significant at the .01 level of
confidence.
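
The arithmetic behind these figures can be checked with a short sketch. It assumes that the number of Integrative solutions in each population can be recovered by rounding the reported percentage of the reported number of groups, and it uses a Pearson chi-square test of homogeneity purely as an illustration; the summary does not state which test the authors applied. The function names and the tabled critical value are additions for this example, not part of the original study.

```python
# Group counts and percentages are taken from the Summary. The counts of
# Integrative solutions are back-calculated from them (an assumption).
groups = {
    "employed": (69, 11.6),
    "business administration": (28, 21.4),
    "human relations": (50, 42.0),
    "introductory psychology": (32, 46.9),
}


def integrative_counts(groups):
    """Round n * pct / 100 to the nearest whole number of groups."""
    return {name: round(n * pct / 100) for name, (n, pct) in groups.items()}


def chi_square_homogeneity(groups, counts):
    """Pearson chi-square for a k x 2 table (Integrative vs. other solutions)."""
    total_n = sum(n for n, _ in groups.values())
    total_hits = sum(counts.values())
    stat = 0.0
    for name, (n, _) in groups.items():
        cells = (
            (counts[name], total_hits),                 # Integrative column
            (n - counts[name], total_n - total_hits),   # all other solutions
        )
        for observed, column_total in cells:
            expected = n * column_total / total_n
            stat += (observed - expected) ** 2 / expected
    return stat, len(groups) - 1


counts = integrative_counts(groups)
stat, df = chi_square_homogeneity(groups, counts)

# Tabled chi-square critical value for df = 3 at the .01 level.
CRITICAL_01_DF3 = 11.345

print(counts)
print(f"chi-square = {stat:.2f}, df = {df}")
print("exceeds .01 critical value:", stat > CRITICAL_01_DF3)
```

Under these assumptions the recovered counts are 8, 6, 21, and 15 Integrative solutions, and the resulting statistic is well above the .01 critical value, which is consistent with the significance level reported above.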


The results are interpreted as providing
support for the proposition that the formal
authority relations in organizations inhibit
creative problem solving. They also suggest
that business may be attracting people who
can work comfortably, but not creatively, in
such formal authority systems.